Chapter 1


Preliminaries 


We start with a brief overview of mathematical logic as covered in this course. 
Next we review some basic notions from elementary set theory, which provides 
a medium for communicating mathematics in a precise and clear way. In this 
course we develop mathematical logic using elementary set theory as given, 
just as one would do with other branches of mathematics, like group theory or 
probability theory. 

For more on the course material, see 


Shoenfield, J. R., Mathematical Logic, Reading, Addison-Wesley, 1967. 
For additional material in Model Theory we refer the reader to 


Chang, C. C. and Keisler, H. J., Model Theory, New York, North- 
Holland, 1990, 


Poizat, B., A Course in Model Theory, Springer, 2000, 


and for additional material on Computability, to 


Rogers, H., Theory of Recursive Functions and Effective Computability,
McGraw-Hill, 1967.


1.1 Mathematical Logic: a brief overview 


Aristotle identified some simple patterns in human reasoning, and Leibniz dreamt 
of reducing reasoning to calculation. As a viable mathematical subject, however, 
logic is relatively recent: the 19th century pioneers were Bolzano, Boole, Cantor, 
Dedekind, Frege, Peano, C.S. Peirce, and E. Schröder. From our perspective
we see their work as leading to boolean algebra, set theory, propositional logic, 
predicate logic, as clarifying the foundations of the natural and real number 
systems, and as introducing suggestive symbolic notation for logical operations. 
Also, their activity led to the view that logic + set theory can serve as a basis for 
all of mathematics. This era did not produce theorems in mathematical logic
of any real depth,¹ but it did bring crucial progress of a conceptual nature,
and the recognition that logic as used in mathematics obeys mathematical rules 
that can be made fully explicit. 

In the period 1900-1950 important new ideas came from Russell, Zermelo, 
Hausdorff, Hilbert, Löwenheim, Ramsey, Skolem, Lusin, Post, Herbrand, Gödel,
Tarski, Church, Kleene, Turing, and Gentzen. They discovered the first real
theorems in mathematical logic, with those of Gödel having a dramatic impact.
Hilbert (in Göttingen), Lusin (in Moscow), Tarski (in Warsaw and Berkeley),
and Church (in Princeton) had many students and collaborators, who made up 
a large part of that generation and the next in mathematical logic. Most of 
these names will be encountered again during the course. 

The early part of the 20th century was also marked by the so-called 


foundational crisis in mathematics. 


A strong impulse for developing mathematical logic came from the attempts 
during these times to provide solid foundations for mathematics. Mathematical 
logic has now taken on a life of its own, and also thrives on many interactions 
with other areas of mathematics and computer science. 

In the second half of the last century, logic as pursued by mathematicians 
gradually branched into four main areas: model theory, computability theory (or 
recursion theory), set theory, and proof theory. The topics in this course are 
part of the common background of mathematicians active in any of these areas. 


What distinguishes mathematical logic within mathematics is that 
statements about mathematical objects and structures


are taken seriously as mathematical objects in their own right. More generally, 
in mathematical logic we formalize (formulate in a precise mathematical way) 
notions used informally by mathematicians such as: 


• property

• statement (in a given language)

• structure

• truth (what it means for a given statement to be true in a given structure)

• proof (from a given set of axioms)

• algorithm


¹ In the case of set theory one could dispute this. Cantor’s discoveries were profound, but
even so, the main influence of set theory on the rest of mathematics was to enable simple 
constructions of great generality, like cartesian products, quotient sets and power sets, and 
this involves only very elementary set theory. 
Once we have mathematical definitions of these notions, we can try to prove
theorems about these formalized notions. If done with imagination, this process 
can lead to unexpected rewards. Of course, formalization tends to caricature 
the informal concepts it aims to capture, but no harm is done if this is kept 
firmly in mind. 


Example. The notorious Goldbach Conjecture asserts that every even integer
greater than 2 is a sum of two prime numbers. With the understanding that
the variables range over N = {0, 1, 2, ...}, and that 0, 1, +, ·, < denote the
usual arithmetic operations and relations on N, this assertion can be expressed
formally as


(GC) ∀x[(1 + 1 < x ∧ even(x)) → ∃p∃q(prime(p) ∧ prime(q) ∧ x = p + q)]


where even(x) abbreviates ∃y(x = y + y) and prime(p) abbreviates
1 < p ∧ ∀r∀s(p = r·s → (r = 1 ∨ s = 1)).


The expression GC is an example of a formal statement (also called a sentence)
in the language of arithmetic, which has symbols 0, 1, +, ·, < to denote arithmetic
operations and relations, in addition to logical symbols like =, ∧, ∨, ¬, →, ∀, ∃,
and variables x, y, z, p, q, r, s.

The Goldbach Conjecture asserts that this particular sentence GC is true in
the structure (N; 0, 1, +, ·, <). (No proof of the Goldbach Conjecture is known.)
It also makes sense to ask whether the sentence GC is true in the structure


(R; 0, 1, +, ·, <)


where now the variables range over R and 0, 1, +, ·, < have their natural ‘real’
meanings. (It’s not, as is easily verified: in R every p > 1 equals √p · √p with
√p ≠ 1, so no element satisfies prime(p), while x = 3 satisfies 1 + 1 < x and
even(x). That the question makes sense, that is, has a yes or no answer, does
not mean that it is of any interest.)
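The abbreviations even(x) and prime(p) can be evaluated mechanically once the quantifiers are restricted to a finite initial segment of N. The following Python sketch does this; the function names and the parameter `bound` are illustrative assumptions, and a bounded check of this kind says nothing about the truth of GC in (N; 0, 1, +, ·, <) itself.

```python
# Bounded evaluation of the sentence GC over {0, ..., bound}.
# Restricting every quantifier to this finite range is an assumption
# made only so that the check terminates.

def even(x, bound):
    # even(x) abbreviates: there is y with x = y + y
    return any(x == y + y for y in range(bound + 1))

def prime(p, bound):
    # prime(p) abbreviates: 1 < p and for all r, s: p = r*s -> (r = 1 or s = 1)
    return 1 < p and all(
        r == 1 or s == 1
        for r in range(bound + 1)
        for s in range(bound + 1)
        if p == r * s
    )

def gc_up_to(bound):
    # For all x <= bound: (1 + 1 < x and even(x)) -> x = p + q for primes p, q
    primes = [p for p in range(bound + 1) if prime(p, bound)]
    for x in range(bound + 1):
        if 1 + 1 < x and even(x, bound):
            if not any(x == p + q for p in primes for q in primes):
                return False
    return True
```

For x ≤ bound all relevant witnesses y, r, s, p, q are themselves ≤ bound, so for these x the restricted quantifiers agree with the unrestricted ones.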

A century of experience gives us confidence that all classical number-theoretic
results (old or new, proved by elementary methods or by sophisticated algebra
and analysis) can be proved from the Peano axioms for arithmetic.² However,
in our present state of knowledge, GC might be true in (N; 0, 1, +, ·, <), but not
provable from those axioms. (On the other hand, once you know what exactly
we mean by

provable from the Peano axioms,


you will see that if GC is provable from those axioms, then GC is true in
(N; 0, 1, +, ·, <), and that if GC is false in (N; 0, 1, +, ·, <), then its negation
¬GC is provable from those axioms.)

The point of this example is simply to make the reader aware of the notions 
“true in a given structure” and “provable from a given set of axioms,” and their 
difference. One objective of this course is to figure out the connections (and 
disconnections) between these notions. 


² Here we do not count as part of classical number theory some results like Ramsey’s
Theorem that can be stated in the language of arithmetic, but are arguably more in the spirit 
of logic and combinatorics. 
Some highlights (1900-1950)


The results below are among the most frequently used facts of mathematical 
logic. The terminology used in stating these results might be unfamiliar, but 
that should change during the course. What matters is to get some preliminary 
idea of what we are aiming for. As will become clear during the course, each of 
these results has stronger versions, on which applications often depend, but in 
this overview we prefer simple statements over strength and applicability. 

We begin with two results that are fundamental in model theory. They
concern the notion of model of Σ where Σ is a set of sentences in a language
L. At this stage we only say by way of explanation that a model of Σ is a
mathematical structure in which all sentences of Σ are true. For example, if Σ
is the (infinite) set of axioms for fields of characteristic zero in the language of
rings, then a model of Σ is just a field of characteristic zero.


Theorem of Löwenheim and Skolem. If Σ is a countable set of sentences
in some language and Σ has a model, then Σ has a countable model.


Compactness Theorem (Gödel, Mal’cev). Let Σ be a set of sentences in some
language. Then Σ has a model if and only if each finite subset of Σ has a model.


The next result goes a little beyond model theory by relating the notion of
“model of Σ” to that of “provability from Σ”:


Completeness Theorem (Gödel, 1930). Let Σ be a set of sentences in some
language L, and let σ be a sentence in L. Then σ is provable from Σ if and only
if σ is true in all models of Σ.


In our treatment we shall obtain the first two theorems as byproducts of the 
Completeness Theorem and its proof. In the case of the Compactness Theorem 
this reflects history, but the theorem of Löwenheim and Skolem predates the
Completeness Theorem. The Löwenheim-Skolem and Compactness theorems
do not mention the notion of provability, and thus model theorists often prefer 
to bypass Completeness in establishing these results; see for example Poizat’s 
book. 


Here is an important early result on a specific arithmetic structure: 


Theorem of Presburger and Skolem. Each sentence in the language of the
structure (Z; 0, 1, +, −, <) that is true in this structure is provable from the
axioms for ordered abelian groups with least positive element 1, augmented, for
each n = 2, 3, 4, ..., by an axiom that says that for every a there is a b such that
a = nb or a = nb + 1 or ... or a = nb + 1 + ··· + 1 (with n disjuncts in total).
Moreover, there is an algorithm that, given any sentence in this language as
input, decides whether this sentence is true in (Z; 0, 1, +, −, <).


Note that in (Z; 0, 1, +, −, <) we have not included multiplication among the
primitives; accordingly, nb stands for b + ··· + b (with n summands).
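The additional axiom for a given n is division with remainder by n, with nb read as repeated addition. A small Python sketch of the witness b and of the index of the disjunct that holds; the names `times` and `remainder_witness` are hypothetical and chosen only for this illustration.

```python
def times(n, b):
    # nb as b + ... + b with n summands; no multiplication is used
    total = 0
    for _ in range(n):
        total = total + b
    return total

def remainder_witness(a, n):
    # Returns (b, r) with a = nb + r and 0 <= r < n, for n >= 2.
    # r counts the 1's added to nb in the disjunct that holds for a.
    b, r = divmod(a, n)  # Python's divmod floors, so 0 <= r < n when n > 0
    return b, r
```

For example, for a = −7 and n = 3 the witness is b = −3, and the disjunct a = nb + 1 + 1 holds.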
When we do include multiplication, the situation changes dramatically: 
Incompleteness and undecidability of arithmetic (Gödel-Church, 1930’s).
One can construct a sentence in the language of arithmetic that is true in the
structure (N; 0, 1, +, ·, <), but not provable from the Peano axioms.

There is no algorithm that, given any sentence in this language as input,
decides whether this sentence is true in (N; 0, 1, +, ·, <).


Here “there is no algorithm” is used in the mathematical sense of 
there cannot exist an algorithm, 


not in the weaker colloquial sense of “no algorithm is known.” This theorem 
is intimately connected with the clarification of notions like computability and 
algorithm in which Turing played a key role. 


In contrast to these incompleteness and undecidability results on (sufficiently 
rich) arithmetic, we have 


Tarski’s theorem on the field of real numbers (1930-1950). Every sentence
in the language of arithmetic that is true in the structure


(R; 0, 1, +, ·, <)


is provable from the axioms for ordered fields augmented by the axioms

- every positive element is a square,

- every odd degree polynomial has a zero.

There is also an algorithm that decides for any given sentence in this language
as input, whether this sentence is true in (R; 0, 1, +, ·, <).


1.2 Sets and Maps 


We shall use this section as an opportunity to fix notations and terminologies 
that are used throughout these notes, and throughout mathematics. In a few 
places we shall need more set theory than we introduce here, for example, ordinals
and cardinals. The following little book is a good place to read about
these matters. (It also contains an axiomatic treatment of set theory starting 
from scratch.) 


Halmos, P. R., Naive set theory, New York, Springer, 1974 


In an axiomatic treatment of set theory as in the book by Halmos all assertions 
about sets below are proved from a few simple axioms. In such a treatment the 
notion of set itself is left undefined, but the axioms about sets are suggested 
by thinking of a set as a collection of mathematical objects, called its elements 
or members. To indicate that an object x is an element of the set A we write 
x ∈ A, in words: x is in A (or: x belongs to A). To indicate that x is not in A we
write x ∉ A. We consider the sets A and B as the same set (notation: A = B)
if and only if they have exactly the same elements. We often introduce a set
via the bracket notation, listing or indicating inside the brackets its elements.
For example, {1, 2, 7} is the set with 1, 2, and 7 as its only elements. Note that
{1, 2, 7} = {2, 7, 1}, and {3, 3} = {3}: the same set can be described in many
different ways. Don’t confuse an object x with the set {x} that has x as its
only element: for example, the object x = {0, 1} is a set that has exactly two
elements, namely 0 and 1, but the set {x} = {{0, 1}} has only one element,
namely x.
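These remarks on equality of sets can be mirrored with Python's built-in `frozenset` type, which likewise compares sets by their elements only. This is merely an illustration with finite sets; the variable names are assumptions of the sketch.

```python
# Two descriptions of the same set: order and repetition do not matter
s1 = frozenset([1, 2, 7])
s2 = frozenset([2, 7, 1])
same_set = (s1 == s2)

# The object x = {0, 1} versus the set {x} = {{0, 1}}
x = frozenset([0, 1])
singleton_x = frozenset([x])
```

Here `x` has two elements, while `singleton_x` has exactly one element, namely `x`.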

Here are some important sets that the reader has probably encountered 
previously. 


Examples. 

(1) The empty set: ∅ (it has no elements).

(2) The set of natural numbers: N = {0, 1, 2, 3, ...}.
(3) The set of integers: Z = {..., −2, −1, 0, 1, 2, ...}.
(4) The set of rational numbers: Q. 

(5) The set of real numbers: R. 

(6) The set of complex numbers: C. 


Remark. Throughout these notes m and n always denote natural numbers. 
For example, “for all m ...” will mean “for all m ∈ N ...”.


If all elements of the set A are in the set B, then we say that A is a subset of B 
(and write A ⊆ B). Thus the empty set ∅ is a subset of every set, and each set
is a subset of itself. We often introduce a set A in our discussions by defining
A to be the set of all elements of a given set B that satisfy some property P.
Notation:


A := {x ∈ B : x satisfies P} (hence A ⊆ B).


Let A and B be sets. Then we can form the following sets: 


a) A ∪ B := {x : x ∈ A or x ∈ B} (union of A and B);

b) A ∩ B := {x : x ∈ A and x ∈ B} (intersection of A and B);

c) A ∖ B := {x : x ∈ A and x ∉ B} (difference of A and B);

d) A × B := {(a, b) : a ∈ A and b ∈ B} (cartesian product of A and B).


Thus the elements of A × B are the so-called ordered pairs (a, b) with a ∈ A
and b ∈ B. The key property of ordered pairs is that we have (a, b) = (c, d) if
and only if a = c and b = d. For example, you may think of R × R as the set
of points (a, b) in the xy-plane of coordinate geometry.

We say that A and B are disjoint if A ∩ B = ∅, that is, they have no element
in common.
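For finite sets the four constructions above, and the notion of disjointness, can be tried out directly. In this Python sketch ordered pairs are represented by Python tuples, which is an assumption of the illustration; the sets A and B are arbitrary examples.

```python
from itertools import product

A = {1, 2, 3}
B = {3, 4}

union = A | B                    # A ∪ B
intersection = A & B             # A ∩ B
difference = A - B               # A ∖ B
cartesian = set(product(A, B))   # A × B as a set of tuples (a, b)

def disjoint(X, Y):
    # X and Y are disjoint iff their intersection is empty
    return len(X & Y) == 0
```

Tuples have the key property of ordered pairs: `(a, b) == (c, d)` holds exactly when `a == c` and `b == d`.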


Remark. In a definition such as we just gave: “We say that --- if —,” the 
meaning of “if” is actually “if and only if.” We committed a similar abuse 
of language earlier in defining set inclusion by the phrase “If —, then we say 
that ---.” We shall continue such abuse, in accordance with tradition, but 
only in similarly worded definitions. Also, we shall often write “iff” or “⇔” to
abbreviate “if and only if.” 
Maps


Definition. A map is a triple f = (A, B, Γ) of sets A, B, Γ such that Γ ⊆ A × B
and for each a ∈ A there is exactly one b ∈ B with (a, b) ∈ Γ; we write f(a) for
this unique b, and call it the value of f at a (or the image of a under f).³ We
call A the domain of f, and B the codomain of f, and Γ the graph of f.⁴ We
write f : A → B to indicate that f is a map with domain A and codomain B,
and in this situation we also say that f is a map from A to B.


Among the many synonyms of map are 
mapping, assignment, function, operator, transformation. 


Typically, “function” is used when the codomain is a set of numbers of some 
kind, “operator” when the elements of domain and codomain are themselves 
functions, and “transformation” is used in geometric situations where domain 
and codomain are equal. (We use equal as synonym for the same or identical; 
also coincide is a synonym for being the same.) 


Examples. 

(1) Given any set A we have the identity map 1_A : A → A defined by 1_A(a) = a
for all a ∈ A.

(2) Any polynomial f(X) = a_0 + a_1 X + ··· + a_n X^n with real coefficients
a_0, ..., a_n gives rise to a function x ↦ f(x) : R → R. We often use the
“maps to” symbol ↦ in this way to indicate the rule by which to each x
in the domain we associate its value f(x).


Definition. Given f : A → B and g : B → C we have a map g ∘ f : A → C
defined by (g ∘ f)(a) = g(f(a)) for all a ∈ A. It is called the composition of g
and f.


Definition. Let f : A → B be a map. It is said to be injective if for all a_1 ≠ a_2
in A we have f(a_1) ≠ f(a_2). It is said to be surjective if for each b ∈ B there
exists a ∈ A such that f(a) = b. It is said to be bijective (or a bijection) if it is
both injective and surjective. For X ⊆ A we put


f(X) := {f(x) : x ∈ X} ⊆ B (direct image of X under f).


(There is a notational conflict here when X is both a subset of A and an element
of A, but it will always be clear from the context when f(X) is meant to be the
direct image of X under f; some authors resolve the conflict by denoting this
direct image by f[X] or in some other way.) We also call f(A) = {f(a) : a ∈ A}
the image of f. For Y ⊆ B we put


f^{-1}(Y) := {x ∈ A : f(x) ∈ Y} ⊆ A (inverse image of Y under f).


Thus surjectivity of our map f is equivalent to f(A) = B. 
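For finite sets, the definition of a map as a triple (domain, codomain, graph) and the notions just introduced can be written out directly. The class `Map` and the function `compose` below are hypothetical names for this sketch; it is one possible encoding, not the only one.

```python
class Map:
    """A map f = (A, B, graph), with graph a subset of A x B."""

    def __init__(self, domain, codomain, graph):
        self.domain = frozenset(domain)
        self.codomain = frozenset(codomain)
        self.graph = frozenset(graph)
        if not all(a in self.domain and b in self.codomain
                   for a, b in self.graph):
            raise ValueError("graph is not a subset of A x B")
        for a in self.domain:
            # there must be exactly one b with (a, b) in the graph
            if sum(1 for x, _ in self.graph if x == a) != 1:
                raise ValueError("graph does not define a map")
        self._value = dict(self.graph)

    def __call__(self, a):
        return self._value[a]

    def image(self, X):
        # direct image of X under f
        return frozenset(self(x) for x in X)

    def preimage(self, Y):
        # inverse image of Y under f
        return frozenset(a for a in self.domain if self(a) in Y)

    def is_injective(self):
        return len(self.image(self.domain)) == len(self.domain)

    def is_surjective(self):
        # surjectivity is equivalent to f(A) = B
        return self.image(self.domain) == self.codomain


def compose(g, f):
    # g o f, defined when the codomain of f is the domain of g
    if f.codomain != g.domain:
        raise ValueError("codomain of f must equal domain of g")
    return Map(f.domain, g.codomain, {(a, g(f(a))) for a in f.domain})
```

Keeping the codomain as part of the data matters: the same graph can give a surjective map for one codomain and a non-surjective map for a larger one.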


³ Sometimes we shall write fa instead of f(a) in order to cut down on parentheses.
⁴ Other words for “domain” and “codomain” are “source” and “target”, respectively.
If f : A → B is a bijection then we have an inverse map f^{-1} : B → A given by
f^{-1}(b) := the unique a ∈ A such that f(a) = b.


Note that then f^{-1} ∘ f = 1_A and f ∘ f^{-1} = 1_B. Conversely, if f : A → B and
g : B → A satisfy g ∘ f = 1_A and f ∘ g = 1_B, then f is a bijection with f^{-1} = g.
(The attentive reader will notice that we just introduced a potential conflict of
notation: for bijective f : A → B and Y ⊆ B, both the inverse image of Y
under f and the direct image of Y under f^{-1} are denoted by f^{-1}(Y); no harm
is done, since these two subsets of A coincide.)

It follows from the definition of “map” that f : A → B and g : C → D are
equal (f = g) if and only if A = C, B = D, and f(x) = g(x) for all x ∈ A. We
say that g : C → D extends f : A → B if A ⊆ C, B ⊆ D, and f(x) = g(x) for
all x ∈ A.⁵


Definition. A set A is said to be finite if there exists n and a bijection 
f : {1, ..., n} → A.


Here we use {1, ..., n} as a suggestive notation for the set {m : 1 ≤ m ≤ n}.
For n = 0 this is just ∅. If A is finite there is exactly one such n (although if
n > 1 there will be more than one bijection f : {1, ..., n} → A); we call this
unique n the number of elements of A or the cardinality of A, and denote it by 
|A|. A set which is not finite is said to be infinite. 


Definition. A set A is said to be countably infinite if there is a bijection N → A.
It is said to be countable if it is either finite or countably infinite. 


Example. The sets N, Z and Q are countably infinite, but the infinite set R 
is not countably infinite. Every infinite set has a countably infinite subset. 


One of the standard axioms of set theory, the Power Set Axiom, says:
For any set A, there is a set whose elements are exactly the subsets of A.


Such a set of subsets of A is clearly uniquely determined by A, is denoted
by P(A), and is called the power set of A. If A is finite, so is P(A) and
|P(A)| = 2^|A|. Note that a ↦ {a} : A → P(A) is an injective map. However,
there is no surjective map A → P(A):


Cantor’s Theorem. Let S : A → P(A) be a map. Then the set
{a ∈ A : a ∉ S(a)} (a subset of A)


is not an element of S(A).


Proof. Suppose otherwise. Then {a ∈ A : a ∉ S(a)} = S(b) where b ∈ A.
Assuming b ∈ S(b) yields b ∉ S(b), a contradiction. Thus b ∉ S(b); but then
b ∈ S(b), again a contradiction. This concludes the proof.
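For a small finite set A the statement of Cantor's Theorem can be confirmed by brute force, by running through every map S : A → P(A). The Python sketch below does so; the function names are hypothetical, a map S is represented as a dictionary, and the check is only an illustration of the statement, not a substitute for the proof.

```python
from itertools import chain, combinations, product

def power_set(A):
    # all subsets of the finite set A, as frozensets
    elems = list(A)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(elems, k) for k in range(len(elems) + 1))]

def diagonal_set(A, S):
    # {a in A : a not in S(a)}, where S is a dict sending a to a subset of A
    return frozenset(a for a in A if a not in S[a])

def cantor_holds_for_all_maps(A):
    # Check that the diagonal set is never a value of S, for every S : A -> P(A)
    elems = list(A)
    subsets = power_set(A)
    for values in product(subsets, repeat=len(elems)):
        S = dict(zip(elems, values))
        if diagonal_set(A, S) in set(S.values()):
            return False
    return True
```

For |A| = 3 this runs through 8^3 = 512 maps; the search space grows very quickly with |A|.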


⁵ We also say “g : C → D is an extension of f : A → B” or “f : A → B is a restriction of
g : C → D.”




Let I and A be sets. Then there is a set whose elements are exactly the maps
f : I → A, and this set is denoted by A^I. For I = {1, ..., n} we also write A^n
instead of A^I. Thus an element of A^n is a map a : {1, ..., n} → A; we usually
think of such an a as the n-tuple (a(1), ..., a(n)), and we often write a_i instead
of a(i). So A^n can be thought of as the set of n-tuples (a_1, ..., a_n) with each
a_i ∈ A. For n = 0 the set A^n has just one element: the empty tuple.

An n-ary relation on A is just a subset of A^n, and an n-ary operation on
A is a map from A^n into A. Instead of “1-ary” we usually say “unary”, and
instead of “2-ary” we can say “binary”. For example, {(a, b) ∈ Z^2 : a < b} is a
binary relation on Z, and integer addition is the binary operation (a, b) ↦ a + b
on Z.


Definition. {a_i}_{i∈I} or (a_i)_{i∈I} denotes a family of objects a_i indexed by the set
I, and is just a suggestive notation for a set {(i, a_i) : i ∈ I}, not to be confused
with the set {a_i : i ∈ I}. (There may be repetitions in the family, that is, it
may happen that a_i = a_j for distinct indices i, j ∈ I, but such repetition is not
reflected in the set {a_i : i ∈ I}. For example, if I = N and a_n = a for all n, then
{(i, a_i) : i ∈ I} = {(i, a) : i ∈ N} is countably infinite, but {a_i : i ∈ I} = {a}
has just one element.) For I = N we usually say “sequence” instead of “family”.


Given any family (A_i)_{i∈I} of sets (that is, each A_i is a set) we have a set

⋃_{i∈I} A_i := {x : x ∈ A_i for some i ∈ I},


the union of the family, or, more informally, the union of the sets A_i. If I is
finite and each A_i is finite, then so is the union above and


|⋃_{i∈I} A_i| ≤ ∑_{i∈I} |A_i|.

If I is countable and each A_i is countable then ⋃_{i∈I} A_i is countable.
Given any family (A_i)_{i∈I} of sets we have a set

∏_{i∈I} A_i := {(a_i)_{i∈I} : a_i ∈ A_i for all i ∈ I},


the product of the family. One axiom of set theory, the Axiom of Choice, is a
bit special, but we shall use it a few times. It says that for any family (A_i)_{i∈I}
of nonempty sets there is a family (a_i)_{i∈I} such that a_i ∈ A_i for all i ∈ I, that
is, ∏_{i∈I} A_i ≠ ∅.


Words 
Definition. Let A be a set. Think of A as an alphabet of letters. A word of
length n on A is an n-tuple (a_1, ..., a_n) of letters a_i ∈ A; because we think
of it as a word (string of letters) we shall write this tuple instead as a_1 ... a_n
(without parentheses or commas). There is a unique word of length 0 on A, the
empty word, written ε. Given a word a = a_1 ... a_n of length n ≥ 1 on A,
the first letter (or first symbol) of a is by definition a_1, and the last letter (or
last symbol) of a is a_n. The set of all words on A is denoted A*:


A* = ⋃_n A^n (disjoint union).


Logical expressions like formulas and terms will be introduced later as words of 
a special form on suitable alphabets. When A ⊆ B we can identify A* with a
subset of B*, and this will be done whenever convenient. 


Definition. Given words a = a_1 ... a_m and b = b_1 ... b_n on A of length m and
n respectively, we define their concatenation ab ∈ A*:


ab = a_1 ... a_m b_1 ... b_n.


Thus ab is a word on A of length m + n. Concatenation is a binary operation
on A* that is associative: (ab)c = a(bc) for all a, b, c ∈ A*, with ε as two-sided
identity: εa = a = aε for all a ∈ A*, and with two-sided cancellation: for all
a, b, c ∈ A*, if ab = ac, then b = c, and if ac = bc, then a = b.
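Words on a finite alphabet can be modelled as Python tuples of letters, with tuple concatenation playing the role of the operation above. This representation and the names `EMPTY`, `concat` and `words_of_length` are assumptions made for the sketch.

```python
from itertools import product

EMPTY = ()  # the empty word, the unique word of length 0

def concat(a, b):
    # concatenation of the words a and b
    return a + b

def words_of_length(A, n):
    # all words of length n on the finite alphabet A, i.e. the elements of A^n
    return [tuple(w) for w in product(sorted(A), repeat=n)]
```

Associativity, the identity property of the empty word, and two-sided cancellation can then be checked exhaustively on all words up to a small length.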


Equivalence Relations and Quotient Sets 


Given a binary relation R on a set A it is often more suggestive to write aRb 
instead of (a, b) ∈ R.


Definition. An equivalence relation on a set A is a binary relation ~ on A such
that for all a, b, c ∈ A:

(i) a ~ a (reflexivity);

(ii) a ~ b implies b ~ a (symmetry);

(iii) (a ~ b and b ~ c) implies a ~ c (transitivity).


Example. Given any n we have the equivalence relation “congruence modulo
n” on Z defined as follows: for any a, b ∈ Z we have

a ≡ b mod n ⇔ a − b = nc for some c ∈ Z.

For n = 0 this is just equality on Z.


Let ~ be an equivalence relation on the set A. The equivalence class a^~ of an
element a ∈ A is defined by a^~ = {b ∈ A : a ~ b} (a subset of A). For a, b ∈ A
we have a^~ = b^~ if and only if a ~ b, and a^~ ∩ b^~ = ∅ if and only if a ≁ b. The
quotient set of A by ~ is by definition the set of equivalence classes:


A/~ = {a^~ : a ∈ A}.


This quotient set is a partition of A, that is, it is a collection of pairwise disjoint 
nonempty subsets of A whose union is A. (Collection is a synonym for set; we 
use it here because we don’t like to say “set of ... subsets ...”.) Every partition 
of A is the quotient set A/~ for a unique equivalence relation ~ on A. Thus
equivalence relations on A and partitions of A are just different ways to describe 
the same situation. 

In the previous example (congruence modulo n) the equivalence classes are 
called congruence classes modulo n (or residue classes modulo n) and the corresponding
quotient set is often denoted Z/nZ.
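Equivalence classes and quotient sets can be computed explicitly on a finite set. Since Z is infinite, the Python sketch below works with congruence modulo n restricted to a finite window of integers, which is an assumption of the illustration; all function names are hypothetical.

```python
def equivalence_class(a, A, related):
    # the class of a: all b in A with a ~ b
    return frozenset(b for b in A if related(a, b))

def quotient_set(A, related):
    # the set of equivalence classes
    return {equivalence_class(a, A, related) for a in A}

def is_partition(classes, A):
    # pairwise disjoint nonempty subsets of A whose union is A
    classes = list(classes)
    nonempty = all(len(c) > 0 for c in classes)
    pairwise_disjoint = all(
        c.isdisjoint(d)
        for i, c in enumerate(classes)
        for d in classes[i + 1:])
    covers = frozenset().union(*classes) == frozenset(A)
    return nonempty and pairwise_disjoint and covers

def congruent_mod(n):
    # a = b mod n  iff  a - b = n*c for some integer c
    if n == 0:
        return lambda a, b: a == b
    return lambda a, b: (a - b) % n == 0
```

On the window {−6, ..., 6}, congruence modulo 3 yields three classes, the restrictions of the three residue classes in Z/3Z.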


Remark. Readers familiar with some abstract algebra will note that the construction
in the example above is a special case of a more general construction,
that of a quotient of a group with respect to a normal subgroup.


Posets 


A partially ordered set (short: poset) is a pair (P, ≤) consisting of a set P and
a partial ordering ≤ on P, that is, ≤ is a binary relation on P such that for all
p, q, r ∈ P:


(i) p ≤ p (reflexivity);
(ii) if p ≤ q and q ≤ p, then p = q (antisymmetry);
(iii) if p ≤ q and q ≤ r, then p ≤ r (transitivity).
If in addition we have for all p, q ∈ P,
(iv) p ≤ q or q ≤ p,


then we say that ≤ is a linear order on P, or that (P, ≤) is a linearly ordered
set.⁶ Each of the sets N, Z, Q, R comes with its familiar linear order on it.


As an example, take any set A and its collection P(A) of subsets. Then 
X ≤ Y :⇔ X ⊆ Y (for subsets X, Y of A)


defines a poset (P(A), ≤), also referred to as the power set of A ordered by
inclusion. This is not a linearly ordered set if A has more than one element.


Finite linearly ordered sets are determined “up to unique isomorphism” by their
size: if (P, ≤) is a linearly ordered set and |P| = n, then there is a unique map
ι : P → {1, ..., n} such that for all p, q ∈ P we have: p ≤ q ⇔ ι(p) ≤ ι(q).
This map ι is a bijection.


Let (P, ≤) be a poset. Here is some useful notation. For x, y ∈ P we set
x ≥ y :⇔ y ≤ x,
x < y :⇔ y > x :⇔ x ≤ y and x ≠ y.


Note that (P, ≥) is also a poset. A least element of P is a p ∈ P such that p ≤ x
for all x ∈ P; a largest element of P is defined likewise, with ≥ instead of ≤.


⁶ One also uses the term total order instead of linear order.
Of course, P can have at most one least element; therefore we can refer to the
least element of P, if P has a least element; likewise, we can refer to the largest 
element of P, if P has a largest element. 

A minimal element of P is a p ∈ P such that there is no x ∈ P with x < p;
a maximal element of P is defined likewise, with > instead of <. If P has a 
least element, then this element is also the unique minimal element of P; some 
posets, however, have more than one minimal element. The reader might want 
to prove the following result to get a feeling for these notions: 


If P is finite and nonempty, then P has a maximal element, and there is a linear
order ≤′ on P that extends ≤ in the sense that


p ≤ q ⟹ p ≤′ q, for all p, q ∈ P.


(Hint: use induction on |P|.)
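The induction suggested in the hint can be turned into a procedure for finite posets: pick a maximal element, remove it, extend the order on the rest, and place the removed element on top. The Python sketch below follows that idea; it represents ≤ by a function `leq` and a linear order by a list, both of which are assumptions of the sketch, as are the function names.

```python
def maximal_elements(P, leq):
    # p is maximal if there is no x in P with p < x
    return [p for p in P if not any(leq(p, x) and p != x for x in P)]

def linear_extension(P, leq):
    # Repeatedly remove a maximal element; a finite nonempty poset has one.
    remaining = list(P)
    top_down = []
    while remaining:
        m = maximal_elements(remaining, leq)[0]
        remaining.remove(m)
        top_down.append(m)
    # Returned from smallest to largest in the new linear order
    return top_down[::-1]
```

If p < q, then p is not maximal as long as q remains, so q is removed before p and ends up later in the returned list. Hence the linear order given by list position extends ≤.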


Let X ⊆ P. A lowerbound (respectively, upperbound) of X in P is an element l ∈
P (respectively, an element u ∈ P), such that l ≤ x for all x ∈ X (respectively,
x ≤ u for all x ∈ X).

We often tacitly consider X as a poset in its own right, by restricting the
given partial ordering of P to X. More precisely this means that we consider
the poset (X, ≤_X) where the partial ordering ≤_X on X is defined by


x ≤_X y :⇔ x ≤ y (x, y ∈ X).


Thus we can speak of least, largest, minimal, and maximal elements of a set
X ⊆ P, when the ambient poset (P, ≤) is clear from the context. For example,
when X is the collection of nonempty subsets of a set A and X is ordered by
inclusion, then the minimal elements of X are the singletons {a} with a ∈ A.
We call X a chain in P if (X, ≤_X) is linearly ordered.


Occasionally we shall use the following fact about posets (P, ≤).


Zorn’s Lemma. Suppose P is nonempty and every nonempty chain in P has 
an upperbound in P. Then P has a maximal element. 


For a further discussion of Zorn’s Lemma and its proof using the Axiom of 
Choice we refer the reader to Halmos’s book on set theory. 


Chapter 2 


Basic Concepts of Logic 


2.1 Propositional Logic 


Propositional logic is the fragment of logic where new statements are built from 
given statements using so-called connectives like “not”, “or” and “and”. The 
truth value of such a new statement is then completely determined by the truth 
values of the given statements. Thus, given any statements p and q, we can 
form the three statements 


¬p (the negation of p, pronounced as “not p”),
p ∨ q (the disjunction of p and q, pronounced as “p or q”),
p ∧ q (the conjunction of p and q, pronounced as “p and q”).


This leads to more complicated combinations like ¬(p ∧ (¬q)). We shall regard
¬p as true if and only if p is not true; also, p ∨ q is defined to be true if and
only if p is true or q is true (including the possibility that both are true), and
p ∧ q is deemed to be true if and only if p is true and q is true. Instead of “not
true” we also say “false”. We now introduce a formalism that makes this into
mathematics.
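The informal truth conditions just described correspond to the behaviour of `not`, `or` and `and` on Python's truth values. The sketch below tabulates them; the function names are hypothetical, and it works with truth values only, not yet with the formal propositions introduced next.

```python
from itertools import product

def truth_table():
    # One row (p, q, not p, p or q, p and q) per assignment of truth values
    return [(p, q, not p, p or q, p and q)
            for p, q in product([True, False], repeat=2)]

def combined(p, q):
    # Truth value of the combination "not (p and (not q))"
    return not (p and (not q))
```

The table has four rows, and `combined(p, q)` is false only when p is true and q is false.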


We start with the five distinct symbols 
⊤ ⊥ ¬ ∨ ∧


to be thought of as true, false, not, or, and and, respectively. These symbols are 
fixed throughout the course, and are called propositional connectives. In this 
section we also fix a set A whose elements will be called propositional atoms (or 
just atoms), such that no propositional connective is an atom. It may help the 
reader to think of an atom a as a variable for which we can substitute arbitrary 
statements, assumed to be either true or false. 

A proposition on A is a word on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} that can
be obtained by applying the following rules: 


(i) each atom a ∈ A (viewed as a word of length 1) is a proposition on A;

(ii) ⊤ and ⊥ (viewed as words of length 1) are propositions on A;

(iii) if p and q are propositions on A, then the concatenations ¬p, ∨pq and ∧pq
are propositions on A.


For the rest of this section “proposition” means “proposition on A”, and p,q,r 
(sometimes with subscripts) will denote propositions. 


Example. Suppose a, b, c are atoms. Then ∧∨¬ab¬c is a proposition. This
follows from the rules above: a is a proposition, so ¬a is a proposition, hence
∨¬ab as well; also ¬c is a proposition, and thus ∧∨¬ab¬c is a proposition.


We defined “proposition” using the suggestive but vague phrase “can be obtained
by applying the following rules”. The reader should take such an informal
description as shorthand for a completely explicit definition, which in the
case at hand is as follows:


A proposition is a word w on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} for which there
is a sequence w_1, ..., w_n of words on that same alphabet, with n ≥ 1, such that
w = w_n and for each k ∈ {1, ..., n}, either w_k ∈ A ∪ {⊤, ⊥} (where each element
in the last set is viewed as a word of length 1), or there are i, j ∈ {1, ..., k−1}
such that w_k is one of the concatenations ¬w_i, ∨w_i w_j, ∧w_i w_j.

We let Prop(A) denote the set of propositions. 
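Whether a given word is a proposition can be tested by reading it from left to right. The Python sketch below is one way to do this, under the assumptions that atoms are single characters and that words are Python strings; the names `parse` and `is_proposition` are hypothetical. It is a recognizer in the spirit of the rules (i)-(iii), not a literal transcription of the explicit definition via sequences w_1, ..., w_n.

```python
TOP, BOT, NOT, OR, AND = "⊤", "⊥", "¬", "∨", "∧"
CONNECTIVES = (TOP, BOT, NOT, OR, AND)

def parse(word, i=0):
    """Return the index just past a proposition starting at position i, or None."""
    if i >= len(word):
        return None
    s = word[i]
    if s == NOT:
        # ¬ must be followed by one proposition
        return parse(word, i + 1)
    if s in (OR, AND):
        # ∨ and ∧ must be followed by two propositions
        j = parse(word, i + 1)
        return None if j is None else parse(word, j)
    # an atom, ⊤ or ⊥: a proposition of length 1
    return i + 1

def is_proposition(word, atoms):
    symbols_ok = all(s in atoms or s in CONNECTIVES for s in word)
    return symbols_ok and parse(word) == len(word)
```

With atoms a, b, c, the word ∧∨¬ab¬c from the example is accepted, while ∨a and ab are rejected.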


Remark. Having the connectives ∨ and ∧ in front of the propositions they
“connect” rather than in between, is called prefix notation or Polish notation.
This is theoretically elegant, but for the sake of readability we usually write p ∨ q
and p ∧ q to denote ∨pq and ∧pq respectively, and we also use parentheses and
brackets if this helps to clarify the structure of a proposition. So the proposition
in the example above could be denoted by [(¬a) ∨ b] ∧ (¬c), or even by (¬a ∨ b) ∧ ¬c
since we shall agree that ¬ binds stronger than ∨ and ∧ in this informal way
of indicating propositions. Because of the informal nature of these conventions, 
we don’t have to give precise rules for their use; it’s enough that each actual 
use is clear to the reader. 

The intended structure of a proposition—how we think of it as built up 
from atoms via connectives—is best exhibited in the form of a tree, a two- 
dimensional array, rather than as a one-dimensional string. Such trees, however, 
occupy valuable space on the printed page, and are typographically demanding. 
Fortunately, our “official” prefix notation does uniquely determine the intended 
structure of a proposition: that is what the next lemma amounts to. 


Lemma 2.1.1 (Unique Readability). If p has length 1, then either p = ⊤, or
p = ⊥, or p is an atom. If p has length > 1, then its first symbol is either ¬,
or ∨, or ∧. If the first symbol of p is ¬, then p = ¬q for a unique q. If the first
symbol of p is ∨, then p = ∨qr for a unique pair (q,r). If the first symbol of p
is ∧, then p = ∧qr for a unique pair (q,r).


(Note that we used here our convention that p,q,r denote propositions.) Only 
the last two claims are worth proving in print, the others should require only 
a moment’s thought. For now we shall assume this lemma without proof. At
the end of this section we establish more general results of this kind which are 
needed also later in the course. 


Remark. Rather than thinking of a proposition as a statement, it’s better 
viewed as a function whose arguments and values are statements: replacing the 
atoms in a proposition by specific mathematical statements like “2 × 2 = 4”,
“2 < 7”, and “every even integer > 2 is the sum of two prime numbers”, we 
obtain again a mathematical statement. 


We shall use the following notational conventions: p → q denotes ¬p ∨ q, and
p ↔ q denotes (p → q) ∧ (q → p). By recursion on n we define

    p_1 ∨ ... ∨ p_n :=  ⊥                               if n = 0,
                        p_1                             if n = 1,
                        p_1 ∨ p_2                       if n = 2,
                        (p_1 ∨ ... ∨ p_{n−1}) ∨ p_n     if n > 2.

Thus p ∨ q ∨ r stands for (p ∨ q) ∨ r. We call p_1 ∨ ... ∨ p_n the disjunction of
p_1,...,p_n. The reason that for n = 0 we take this disjunction to be ⊥ is that
we want a disjunction to be true iff (at least) one of the disjuncts is true.

Similarly, the conjunction p_1 ∧ ... ∧ p_n of p_1,...,p_n is defined by replacing
everywhere ∨ by ∧ and ⊥ by ⊤ in the definition of p_1 ∨ ... ∨ p_n.


Definition. A truth assignment is a map t : A → {0,1}. We extend such a t
to t̂ : Prop(A) → {0,1} by requiring

(i) t̂(⊤) = 1, t̂(⊥) = 0,

(ii) t̂(¬p) = 1 − t̂(p),

(iii) t̂(p ∨ q) = max(t̂(p), t̂(q)), t̂(p ∧ q) = min(t̂(p), t̂(q)).


Note that there is exactly one such extension t̂ by unique readability. To simplify
notation we often write t instead of t̂. The array below is called a truth table.
It shows on each row below the top row how the two leftmost entries t(p) and
t(q) determine t(¬p), t(p ∨ q), t(p ∧ q), t(p → q) and t(p ↔ q).


    p | q | ¬p | p ∨ q | p ∧ q | p → q | p ↔ q
    0 | 0 | 1  |   0   |   0   |   1   |   1
    0 | 1 | 1  |   1   |   0   |   1   |   0
    1 | 0 | 0  |   1   |   0   |   0   |   0
    1 | 1 | 0  |   1   |   1   |   1   |   1

Let t : A → {0,1}. Note that t(p → q) = 1 if and only if t(p) ≤ t(q), and that
t(p ↔ q) = 1 if and only if t(p) = t(q).

Suppose a_1,...,a_n are the distinct atoms that occur in p, and we know how
p is built up from those atoms. Then we can compute in a finite number of steps
t(p) from t(a_1),...,t(a_n). In particular, t(p) = t′(p) for any t′ : A → {0,1} such
that t(a_i) = t′(a_i) for i = 1,...,n.
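
The finite computation of t(p) from the values on the atoms can be sketched as
a short recursion. This is an illustration under assumptions that are not in
the original text: propositions are encoded as nested Python tuples
("not", p), ("or", p, q), ("and", p, q), the strings "T" and "F" stand for
⊤ and ⊥, any other string is an atom, and a truth assignment is a dict. The
names evaluate and implies are hypothetical.

```python
def evaluate(p, t):
    """Extend the truth assignment t (a dict atom -> 0 or 1) to the proposition p."""
    if p == "T":
        return 1
    if p == "F":
        return 0
    if isinstance(p, str):          # an atom
        return t[p]
    op = p[0]
    if op == "not":
        return 1 - evaluate(p[1], t)
    if op == "or":
        return max(evaluate(p[1], t), evaluate(p[2], t))
    if op == "and":
        return min(evaluate(p[1], t), evaluate(p[2], t))
    raise ValueError(f"unknown connective: {op!r}")


def implies(p, q):
    """The abbreviation p -> q, which denotes (not p) or q."""
    return ("or", ("not", p), q)
```

Only the values of t on the atoms occurring in p are ever looked up, which is
the content of the last sentence above.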
Definition. We say that p is a tautology (notation: ⊨ p) if t(p) = 1 for all
t : A → {0,1}. We say that p is satisfiable if t(p) = 1 for some t : A → {0,1}.


Thus ⊤ is a tautology, and p ∨ ¬p, p → (p ∨ q) are tautologies for all p and
q. By the remark preceding the definition one can verify whether any given p
with exactly n distinct atoms in it is a tautology by computing 2^n numbers and
checking that these numbers all come out 1. (To do this accurately by hand is 
already cumbersome for n = 5, but computers can handle somewhat larger n. 
Fortunately, other methods are often efficient for special cases.) 
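
The brute-force check over all 2^n rows of the truth table can be sketched as
follows. As before, this is only an illustration: the tuple encoding of
propositions ("not", "or", "and", with "T" and "F" for ⊤ and ⊥) and the names
atoms, evaluate and is_tautology are assumptions of this sketch, not notation
from the text.

```python
from itertools import product


def atoms(p):
    """The set of atoms occurring in p."""
    if p in ("T", "F"):
        return set()
    if isinstance(p, str):
        return {p}
    return set().union(*(atoms(q) for q in p[1:]))


def evaluate(p, t):
    """Value of p under the truth assignment t (a dict atom -> 0 or 1)."""
    if p == "T":
        return 1
    if p == "F":
        return 0
    if isinstance(p, str):
        return t[p]
    if p[0] == "not":
        return 1 - evaluate(p[1], t)
    values = [evaluate(q, t) for q in p[1:]]
    # "or" is the maximum, "and" the minimum of the two values.
    return max(values) if p[0] == "or" else min(values)


def is_tautology(p):
    """Check t(p) = 1 for all 2**n assignments to the n atoms of p."""
    names = sorted(atoms(p))
    return all(
        evaluate(p, dict(zip(names, bits))) == 1
        for bits in product((0, 1), repeat=len(names))
    )
```

The running time grows like 2^n, in line with the remark that this method is
only practical for small n.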


Remark. Note that ⊨ p ↔ q iff t(p) = t(q) for all t : A → {0,1}. We call p
equivalent to q if ⊨ p ↔ q. Note that “equivalent to” defines an equivalence
relation on Prop(A). The lemma below gives a useful list of equivalences. We 
leave it to the reader to verify them. 


Lemma 2.1.2. For all p,q,r we have the following equivalences: 


(1) ⊨ (p ∨ p) ↔ p, ⊨ (p ∧ p) ↔ p,

(2) ⊨ (p ∨ q) ↔ (q ∨ p), ⊨ (p ∧ q) ↔ (q ∧ p),

(3) ⊨ (p ∨ (q ∨ r)) ↔ ((p ∨ q) ∨ r), ⊨ (p ∧ (q ∧ r)) ↔ ((p ∧ q) ∧ r),

(4) ⊨ (p ∨ (q ∧ r)) ↔ (p ∨ q) ∧ (p ∨ r), ⊨ (p ∧ (q ∨ r)) ↔ (p ∧ q) ∨ (p ∧ r),

(5) ⊨ (p ∨ (p ∧ q)) ↔ p, ⊨ (p ∧ (p ∨ q)) ↔ p,

(6) ⊨ (¬(p ∨ q)) ↔ (¬p ∧ ¬q), ⊨ (¬(p ∧ q)) ↔ (¬p ∨ ¬q),

(7) ⊨ (p ∨ ¬p) ↔ ⊤, ⊨ (p ∧ ¬p) ↔ ⊥,

(8) ⊨ ¬¬p ↔ p.

Items (1), (2), (3), (4), (5), and (6) are often referred to as the idempotent
law, commutativity, associativity, distributivity, the absorption law, and the De
Morgan law, respectively. Note the left-right symmetry in (1)-(7): the so-called
duality of propositional logic. We shall return to this issue in the more algebraic
setting of boolean algebras.

Some notation: let (p_i)_{i∈I} be a family of propositions with finite index set
I, choose a bijection k ↦ i(k) : {1,...,n} → I and set

    ⋁_{i∈I} p_i := p_{i(1)} ∨ ··· ∨ p_{i(n)},      ⋀_{i∈I} p_i := p_{i(1)} ∧ ··· ∧ p_{i(n)}.

If I is clear from context we just write ⋁_i p_i and ⋀_i p_i instead. Of course, the
notations ⋁_{i∈I} p_i and ⋀_{i∈I} p_i can only be used when the particular choice of
bijection of {1,...,n} with I does not matter; this is usually the case, because
the equivalence class of p_{i(1)} ∨ ··· ∨ p_{i(n)} does not depend on this choice, and
the same is true for the equivalence class of p_{i(1)} ∧ ··· ∧ p_{i(n)}.




Next we define “model of Σ” and “tautological consequence of Σ”.


Definition. Let Σ ⊆ Prop(A). By a model of Σ we mean a truth assignment
t : A → {0,1} such that t(p) = 1 for all p ∈ Σ. We say that a proposition p is
a tautological consequence of Σ (written Σ ⊨ p) if t(p) = 1 for every model t of
Σ. Note that ⊨ p is the same as ∅ ⊨ p.
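
When Σ is finite and only finitely many atoms are involved, the definition of
Σ ⊨ p can be checked by enumerating truth assignments. The sketch below is an
illustration under assumptions of our own: a proposition is represented by a
Python function that maps a truth assignment (a dict) to 0 or 1, and the names
models and entails are hypothetical.

```python
from itertools import product


def models(sigma, atom_names):
    """Yield every truth assignment on atom_names that is a model of sigma."""
    for bits in product((0, 1), repeat=len(atom_names)):
        t = dict(zip(atom_names, bits))
        if all(p(t) == 1 for p in sigma):
            yield t


def entails(sigma, p, atom_names):
    """Sigma |= p: p takes the value 1 in every model of sigma."""
    return all(p(t) == 1 for t in models(sigma, atom_names))


# Example propositions on the atoms a, b (the values follow the truth table).
a = lambda t: t["a"]
b = lambda t: t["b"]
not_a = lambda t: 1 - t["a"]
a_implies_b = lambda t: max(1 - t["a"], t["b"])
```

With these definitions, entails([a, a_implies_b], b, ["a", "b"]) holds, which
is an instance of Modus Ponens for ⊨. A set with no model, such as {a, ¬a},
has every proposition as a tautological consequence.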


Lemma 2.1.3. Let Σ ⊆ Prop(A) and p,q ∈ Prop(A). Then

(1) Σ ⊨ p ∧ q ⟺ Σ ⊨ p and Σ ⊨ q,

(2) Σ ⊨ p ⟹ Σ ⊨ p ∨ q,

(3) Σ ∪ {p} ⊨ q ⟺ Σ ⊨ p → q,

(4) if Σ ⊨ p and Σ ⊨ p → q, then Σ ⊨ q (Modus Ponens).


Proof. We will prove (3) here and leave the rest as an exercise.
(⇒) Assume Σ ∪ {p} ⊨ q. To derive Σ ⊨ p → q we consider any model
t : A → {0,1} of Σ, and need only show that then t(p → q) = 1. If t(p) = 1
then t(Σ ∪ {p}) ⊆ {1}, hence t(q) = 1 and thus t(p → q) = 1. If t(p) = 0 then
t(p → q) = 1 by definition.

(⇐) Assume Σ ⊨ p → q. To derive Σ ∪ {p} ⊨ q we consider any model
t : A → {0,1} of Σ ∪ {p}, and need only derive that t(q) = 1. By assumption
t(p → q) = 1 and in view of t(p) = 1, this gives t(q) = 1 as required.


We finish this section with the promised general result on unique readability. 
We also establish facts of similar nature that are needed later. 


Definition. Let F be a set of symbols with a function a : F → ℕ (called the
arity function). A symbol f ∈ F is said to have arity n if a(f) = n. A word on
F is said to be admissible if it can be obtained by applying the following rules:

(i) If f ∈ F has arity 0, then f viewed as a word of length 1 is admissible.

(ii) If f ∈ F has arity m > 0 and t_1,...,t_m are admissible words on F, then
the concatenation f t_1...t_m is admissible.


Below we just write “admissible word” instead of “admissible word on F”. Note 
that the empty word is not admissible, and that the last symbol of an admissible 
word cannot be of arity > 0. 


Example. Take F = A ∪ {⊤,⊥,¬,∨,∧} and define arity : F → ℕ by
arity(x) = 0 for x ∈ A ∪ {⊤,⊥}, arity(¬) = 1, arity(∨) = arity(∧) = 2.
Then the set of admissible words is just Prop(A).


Lemma 2.1.4. Let t_1,...,t_m and u_1,...,u_n be admissible words and w any
word on F such that t_1...t_m w = u_1...u_n. Then m ≤ n, t_i = u_i for i =
1,...,m, and w = u_{m+1}···u_n.


Proof. By induction on the length of u_1...u_n. If this length is 0, then m = n =
0 and w is the empty word. Suppose the length is > 0, and assume the lemma
holds for smaller lengths. Note that n > 0. If m = 0, then the conclusion of the
lemma holds, so suppose m > 0. The first symbol of t_1 equals the first symbol
of u_1. Say this first symbol is h ∈ F with arity k. Then t_1 = h a_1...a_k and
u_1 = h b_1...b_k where a_1,...,a_k and b_1,...,b_k are admissible words. Cancelling
the first symbol h gives

    a_1...a_k t_2...t_m w = b_1...b_k u_2...u_n.

(Caution: any of k, m−1, n−1 could be 0.) We have length(b_1...b_k u_2...u_n) =
length(u_1...u_n) − 1, so the induction hypothesis applies. It yields k+m−1 ≤
k+n−1 (so m ≤ n), a_1 = b_1,...,a_k = b_k (so t_1 = u_1), t_2 = u_2,...,t_m = u_m,
and w = u_{m+1}···u_n.

Here are two immediate consequences that we shall use:


1. Let t_1,...,t_m and u_1,...,u_n be admissible words such that t_1...t_m =
u_1...u_n. Then m = n and t_i = u_i for i = 1,...,m.


2. Let t and u be admissible words and w a word on F such that tw = u.
Then t = u and w is the empty word.


Lemma 2.1.5 (Unique Readability).
Each admissible word equals f t_1...t_m for a unique tuple (f,t_1,...,t_m) where
f ∈ F has arity m and t_1,...,t_m are admissible words.


Proof. Suppose f t_1...t_m = g u_1...u_n where f,g ∈ F have arity m and n
respectively, and t_1,...,t_m,u_1,...,u_n are admissible words on F. We have to
show that then f = g, m = n and t_i = u_i for i = 1,...,m. Observe first that
f = g since f and g are the first symbols of two equal words. After cancelling
the first symbol of both words, the first consequence of the previous lemma leads
to the desired conclusion.
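
Unique readability is what makes parsing of admissible words deterministic. The
following Python sketch illustrates this; it is not part of the original text.
It assumes that a word is a string (or list) of symbols and that the arity
function is given as a dict, and the names split_admissible and decompose are
hypothetical.

```python
def split_admissible(word, arity, start=0):
    """Return the end index of the admissible word starting at `start`.

    Raises ValueError if the word ends before all arguments are read.
    """
    if start >= len(word):
        raise ValueError("unexpected end of word")
    end = start + 1
    # Read as many admissible arguments as the arity of the first symbol demands.
    for _ in range(arity[word[start]]):
        end = split_admissible(word, arity, end)
    return end


def decompose(word, arity):
    """Return (f, [t_1, ..., t_m]) with word = f t_1 ... t_m, as in Lemma 2.1.5."""
    if len(word) == 0:
        raise ValueError("the empty word is not admissible")
    f, parts, pos = word[0], [], 1
    for _ in range(arity[f]):
        end = split_admissible(word, arity, pos)
        parts.append(word[pos:end])
        pos = end
    if pos != len(word):
        raise ValueError("trailing symbols: word is not admissible")
    return f, parts
```

With the arity function of the example above, encoded say as
{"a": 0, "b": 0, "c": 0, "~": 1, "|": 2, "&": 2}, the word "&|~ab~c"
decomposes into the symbol "&" and the two admissible words "|~ab" and "~c".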


Given words v,w ∈ F* and i ∈ {1,...,length(w)}, we say that v occurs in
w at starting position i if w = w_1 v w_2 where w_1,w_2 ∈ F* and w_1 has length
i−1. (For example, if f,g ∈ F are distinct, then the word fgf has exactly
two occurrences in the word fgfgf, one at starting position 1, and the other
at starting position 3; these two occurrences overlap, but such overlapping is
impossible with admissible words, see exercise 5 at the end of this section.)
Given w = w_1 v w_2 as above, and given v′ ∈ F*, the result of replacing v in w at
starting position i by v′ is by definition the word w_1 v′ w_2.


Lemma 2.1.6. Let w be an admissible word and 1 ≤ i ≤ length(w). Then there
is a unique admissible word that occurs in w at starting position i.


Proof. We prove existence by induction on length(w). Uniqueness then follows
from the fact stated just before Lemma 2.1.5. Clearly w is an admissible word
occurring in w at starting position 1. Suppose i > 1. Then we write w =
f t_1...t_n where f ∈ F has arity n > 0, and t_1,...,t_n are admissible words, and
we take j ∈ {1,...,n} such that

    1 + length(t_1) + ··· + length(t_{j−1}) < i ≤ 1 + length(t_1) + ··· + length(t_j).

Now apply the inductive assumption to t_j.
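
The admissible word at a given starting position can also be found by a single
left-to-right scan. The sketch below uses a counting argument rather than the
induction of the proof above: it tracks how many admissible words still have to
be read, and stops when that number drops to 0. The encoding (words as strings,
arities as a dict) and the name subword_at are assumptions of this sketch.

```python
def subword_at(word, arity, i):
    """Return the admissible word occurring in `word` at starting position i (1-based).

    Assumes `word` is admissible and 1 <= i <= len(word).
    """
    needed = 1                      # number of admissible words still to be read
    for end in range(i - 1, len(word)):
        # A symbol of arity k completes one pending word and opens k new ones.
        needed += arity[word[end]] - 1
        if needed == 0:
            return word[i - 1:end + 1]
    raise ValueError("word is not admissible")
```

For the word "&|~ab~c" with the arities of the earlier example, position 2
gives "|~ab" and position 6 gives "~c".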


Remark. Let w = f t_1...t_n where f ∈ F has arity n > 0, and t_1,...,t_n are
admissible words. Put l_j := 1 + length(t_1) + ··· + length(t_j) for j = 0,...,n (so
l_0 = 1). Suppose l_{j−1} < i ≤ l_j, 1 ≤ j ≤ n, and let v be the admissible word
that occurs in w at starting position i. Then the proof of the last lemma shows
that this occurrence is entirely inside t_j, that is, i − 1 + length(v) ≤ l_j.


Corollary 2.1.7. Let w be an admissible word and 1 ≤ i ≤ length(w). Then
the result of replacing the admissible word v in w at starting position i by an
admissible word v′ is again an admissible word.

This follows by a routine induction on length(w), using the last remark.


Exercises. In the exercises below, A = {a_1,...,a_n}, |A| = n.


(1) (Disjunctive Normal Form) Each p is equivalent to a disjunction

    p_1 ∨ ··· ∨ p_k

where each disjunct p_i is a conjunction a_1^{ε_1} ∧ ... ∧ a_n^{ε_n} with all ε_j ∈ {−1,1} and
where for an atom a we put a^1 := a and a^{−1} := ¬a.


(2) (Conjunctive Normal Form) Same as last problem, except that the signs ∨ and ∧
are interchanged, as well as the words “disjunction” and “conjunction,” and also
the words “disjunct” and “conjunct.”


(3) To each p associate the function f_p : {0,1}^A → {0,1} defined by f_p(t) = t(p).
(Think of a truth table for p where the 2^n rows correspond to the 2^n truth
assignments t : A → {0,1}, and the column under p records the values t(p).)
Then for every function f : {0,1}^A → {0,1} there is a p such that f = f_p.


(4) Let ∼ be the equivalence relation on Prop(A) given by

    p ∼ q :⟺ ⊨ p ↔ q.

Then the quotient set Prop(A)/∼ is finite; determine its cardinality as a function
of n = |A|.


(5) Let w be an admissible word and 1 ≤ i < i′ ≤ length(w). Let v and v′ be the
admissible words that occur at starting positions i and i′ respectively in w. Then
these occurrences are either nonoverlapping, that is, i − 1 + length(v) < i′, or the
occurrence of v′ is entirely inside that of v, that is,

    i′ − 1 + length(v′) ≤ i − 1 + length(v).


2.2 Completeness for Propositional Logic 


In this section we introduce a proof system for propositional logic, state the 
completeness of this proof system, and then prove this completeness. 

As in the previous section we fix a set A of atoms, and the conventions of 
that section remain in force. 


A propositional axiom is by definition a proposition that occurs in the list below,
for some choice of p,q,r:

1. ⊤

2. p → (p ∨ q); p → (q ∨ p)

3. ¬p → (¬q → ¬(p ∨ q))

4. (p ∧ q) → p; (p ∧ q) → q

5. p → (q → (p ∧ q))

6. (p → (q → r)) → ((p → q) → (p → r))

7. p → (¬p → ⊥)

8. (¬p → ⊥) → p


Each of items 2-8 describes infinitely many propositional axioms. That is why
we do not call these items axioms, but axiom schemes. For example, if a,b ∈ A,
then a → (a ∨ ⊥) and b → (b ∨ (¬a ∧ ¬b)) are distinct propositional axioms,
and both instances of axiom scheme 2. It is easy to check that all propositional
axioms are tautologies.


Here is our single rule of inference for propositional logic:
Modus Ponens (MP): from p and p → q, infer q.
In the rest of this section Σ denotes a set of propositions, that is, Σ ⊆ Prop(A).


Definition. A formal proof, or just proof, of p from Σ is a sequence p_1,...,p_n
with n ≥ 1 and p_n = p, such that for k = 1,...,n:

(i) either p_k ∈ Σ,

(ii) or p_k is a propositional axiom,

(iii) or there are i,j ∈ {1,...,k−1} such that p_k can be inferred from p_i and
p_j by MP.

If there exists a proof of p from Σ, then we write Σ ⊢ p, and say Σ proves p.

For Σ = ∅ we also write ⊢ p instead of ∅ ⊢ p.


Lemma 2.2.1. ⊢ p → p.


Proof. The proposition p → ((p → p) → p) is a propositional axiom by axiom
scheme 2. By axiom scheme 6,

    {p → ((p → p) → p)} → {(p → (p → p)) → (p → p)}

is a propositional axiom. Applying MP to these two axioms yields

    ⊢ (p → (p → p)) → (p → p).

Since p → (p → p) is also a propositional axiom by scheme 2, we can apply MP
again to obtain ⊢ p → p.
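
The five-step formal proof just described can be verified mechanically. The
Python sketch below is an illustration only and is deliberately partial: it
recognizes instances of axiom schemes 2 and 6 (the only ones this proof uses)
and applications of MP, but no other scheme. Propositions are encoded as nested
tuples ("not", p) and ("or", p, q), with p → q unfolded to ¬p ∨ q as in the
notational conventions. All function names are hypothetical.

```python
def imp(p, q):
    """p -> q abbreviates (not p) or q."""
    return ("or", ("not", p), q)


def unpack_imp(x):
    """If x has the shape p -> q, return (p, q); otherwise None."""
    if (isinstance(x, tuple) and x[0] == "or"
            and isinstance(x[1], tuple) and x[1][0] == "not"):
        return x[1][1], x[2]
    return None


def is_scheme_2(x):
    """x is p -> (p or q) or p -> (q or p) for some p, q."""
    parts = unpack_imp(x)
    if parts is None:
        return False
    p, rhs = parts
    return isinstance(rhs, tuple) and rhs[0] == "or" and p in (rhs[1], rhs[2])


def is_scheme_6(x):
    """x is (p -> (q -> r)) -> ((p -> q) -> (p -> r)) for some p, q, r."""
    outer = unpack_imp(x)
    if outer is None:
        return False
    left = unpack_imp(outer[0])
    inner = unpack_imp(left[1]) if left else None
    if inner is None:
        return False
    p, (q, r) = left[0], inner
    return outer[1] == imp(imp(p, q), imp(p, r))


def check_proof(steps, sigma=()):
    """Each step must be in sigma, an instance of scheme 2 or 6, or follow by MP."""
    for k, step in enumerate(steps):
        earlier = steps[:k]
        by_mp = any(imp(a, step) in earlier for a in earlier)
        if not (step in sigma or is_scheme_2(step) or is_scheme_6(step) or by_mp):
            return False
    return True


def proof_of_p_implies_p(p):
    """The sequence of propositions used in the proof of Lemma 2.2.1."""
    s1 = imp(p, imp(imp(p, p), p))                 # scheme 2
    s3 = imp(imp(p, imp(p, p)), imp(p, p))         # MP from s1 and s2
    s2 = imp(s1, s3)                               # scheme 6
    s4 = imp(p, imp(p, p))                         # scheme 2
    s5 = imp(p, p)                                 # MP from s4 and s3
    return [s1, s2, s3, s4, s5]
```

Calling check_proof(proof_of_p_implies_p("a")) accepts the sequence, whereas
the one-element sequence consisting of a → a alone is rejected, since a → a is
not itself an instance of scheme 2 or 6.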


The next result shows that our proof system is sound, to use a term that is often 
used in this connection. For the straightforward proof, use that propositional 
axioms are tautologies, and use part (4) of Lemma 2.1.3. 


Proposition 2.2.2. If Σ ⊢ p, then Σ ⊨ p.

The converse is true but less obvious. In other words:


Theorem 2.2.3 (Completeness - first form).

    Σ ⊢ p ⟺ Σ ⊨ p

There is some arbitrariness in our choice of axioms and rule, and thus in our
notion of formal proof. This is in contrast to the definition of ⊨, which merely
formalizes the underlying idea of propositional logic as stated in the introduction
to the previous section. However, the equivalence of ⊢ and ⊨ (Completeness
Theorem) means that our choice of axioms and rule gives a complete proof
system. Moreover, this equivalence has consequences which can be stated in
terms of ⊨ alone. An example is the Compactness Theorem:


Theorem 2.2.4 (Compactness of Propositional Logic). If Σ ⊨ p, then there is
a finite subset Σ_0 of Σ such that Σ_0 ⊨ p.


It is convenient to prove first a variant of the Completeness Theorem. 


Definition. We say that Σ is inconsistent if Σ ⊢ ⊥, and otherwise (that is, if
Σ ⊬ ⊥) we call Σ consistent.


Theorem 2.2.5 (Completeness - second form).
Σ is consistent if and only if Σ has a model.


From this second form of the Completeness Theorem we obtain easily an
alternative form of the Compactness of Propositional Logic:


Corollary 2.2.6. Σ has a model ⟺ every finite subset of Σ has a model.


We first show that the second form of the Completeness Theorem implies the
first form. For this we need a lemma that will also be useful later in the course.
It says that “→” behaves indeed as one might hope.


Lemma 2.2.7 (Deduction Lemma). Suppose Σ ∪ {p} ⊢ q. Then Σ ⊢ p → q.


Proof. By induction on (formal) proofs from Σ ∪ {p}.

If q is a propositional axiom, then Σ ⊢ q, and since q → (p → q) is a
propositional axiom, MP yields Σ ⊢ p → q. If q ∈ Σ ∪ {p}, then either q ∈ Σ
in which case the same argument as before gives Σ ⊢ p → q, or q = p and then
Σ ⊢ p → q since ⊢ p → p by the lemma above.

Now assume that q is obtained by MP from r and r → q, where Σ ∪ {p} ⊢ r
and Σ ∪ {p} ⊢ r → q and where we assume inductively that Σ ⊢ p → r and
Σ ⊢ p → (r → q). Then we obtain Σ ⊢ p → q from the propositional axiom

    (p → (r → q)) → ((p → r) → (p → q))

by applying MP twice.


Corollary 2.2.8. Σ ⊢ p if and only if Σ ∪ {¬p} is inconsistent.


Proof. (⇒) Assume Σ ⊢ p. Since p → (¬p → ⊥) is a propositional axiom, we
can apply MP twice to get Σ ∪ {¬p} ⊢ ⊥. Hence Σ ∪ {¬p} is inconsistent.

(⇐) Assume Σ ∪ {¬p} is inconsistent. Then Σ ∪ {¬p} ⊢ ⊥, and so by the
Deduction Lemma we have Σ ⊢ ¬p → ⊥. Since (¬p → ⊥) → p is a propositional
axiom, MP yields Σ ⊢ p.

We leave the proof of the next result as an exercise.

Corollary 2.2.9. If Σ is consistent and Σ ⊢ p, then Σ ∪ {p} is consistent.


Corollary 2.2.10. The second form of Completeness (Theorem 2.2.5) implies
the first form (Theorem 2.2.3).


Proof. Assume the second form of Completeness holds, and that Σ ⊨ p. We
want to show that then Σ ⊢ p. From Σ ⊨ p it follows that Σ ∪ {¬p} has
no model. Hence by the second form of Completeness, the set Σ ∪ {¬p} is
inconsistent. Then by Corollary 2.2.8 we have Σ ⊢ p.


Definition. We say that Σ is complete if Σ is consistent, and for each p either
Σ ⊢ p or Σ ⊢ ¬p.


Completeness as a property of a set of propositions should not be confused 
with the completeness of our proof system as expressed by the Completeness 
Theorem. (It is just a historical accident that we use the same word.) 

Below we use Zorn’s Lemma to show that any consistent set of propositions 
can be extended to a complete set of propositions. 


Lemma 2.2.11 (Lindenbaum). Suppose Σ is consistent. Then Σ ⊆ Σ′ for some
complete Σ′ ⊆ Prop(A).


Proof. Let P be the collection of all consistent subsets of Prop(A) that contain
Σ. In particular Σ ∈ P. We consider P as partially ordered by inclusion. Any
totally ordered subcollection {Σ_i : i ∈ I} of P with I ≠ ∅ has an upper bound
in P, namely ⋃{Σ_i : i ∈ I}. (To see this it suffices to check that ⋃{Σ_i : i ∈ I}
is consistent. Suppose otherwise, that is, suppose ⋃{Σ_i : i ∈ I} ⊢ ⊥. Since a
proof can use only finitely many of the axioms in ⋃{Σ_i : i ∈ I}, there exists
i ∈ I such that Σ_i ⊢ ⊥, contradicting the consistency of Σ_i.)

Thus by Zorn’s lemma P has a maximal element Σ′. We claim that then Σ′
is complete. For any p, if Σ′ ⊬ p, then by Corollary 2.2.8 the set Σ′ ∪ {¬p} is
consistent, hence ¬p ∈ Σ′ by maximality of Σ′, and thus Σ′ ⊢ ¬p.


Suppose A is countable. For this case we can give a proof of Lindenbaum’s 
Lemma without using Zorn’s Lemma as follows. 


Proof. Because A is countable, Prop(A) is countable. Take an enumeration
(p_n)_{n∈ℕ} of Prop(A). We construct an increasing sequence Σ = Σ_0 ⊆ Σ_1 ⊆ ...
of consistent subsets of Prop(A) as follows. Given a consistent Σ_n ⊆ Prop(A)
we define

    Σ_{n+1} := Σ_n ∪ {p_n}     if Σ_n ⊢ p_n,
    Σ_{n+1} := Σ_n ∪ {¬p_n}    if Σ_n ⊬ p_n,

so Σ_{n+1} remains consistent by Corollaries 2.2.8 and 2.2.9. Thus

    Σ_∞ := ⋃{Σ_n : n ∈ ℕ}

is consistent and also complete: for any n either p_n ∈ Σ_{n+1} ⊆ Σ_∞ or
¬p_n ∈ Σ_{n+1} ⊆ Σ_∞.



Define the truth assignment t_Σ : A → {0,1} by

    t_Σ(a) = 1 if Σ ⊢ a, and t_Σ(a) = 0 otherwise.

Lemma 2.2.12. Suppose Σ is complete. Then for each p we have

    Σ ⊢ p ⟺ t_Σ(p) = 1.

In particular, t_Σ is a model of Σ.


Proof. We proceed by induction on the length of p. If p is an atom or p = ⊤
or p = ⊥, then the equivalence follows immediately from the definitions. It
remains to consider the three cases below.

Case 1: p = ¬q, and (inductive assumption) Σ ⊢ q ⟺ t_Σ(q) = 1.
(⇒) Suppose Σ ⊢ p. Then t_Σ(p) = 1: Otherwise, t_Σ(q) = 1, so Σ ⊢ q by the
inductive assumption; since q → (p → ⊥) is a propositional axiom, we can apply
MP twice to get Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose t_Σ(p) = 1. Then t_Σ(q) = 0, so Σ ⊬ q, and thus Σ ⊢ p by
completeness of Σ.

Case 2: p = q ∨ r, Σ ⊢ q ⟺ t_Σ(q) = 1, and Σ ⊢ r ⟺ t_Σ(r) = 1.
(⇒) Suppose that Σ ⊢ p. Then t_Σ(p) = 1: Otherwise, t_Σ(p) = 0, so t_Σ(q) = 0
and t_Σ(r) = 0, hence Σ ⊬ q and Σ ⊬ r, and thus Σ ⊢ ¬q and Σ ⊢ ¬r by
completeness of Σ; since ¬q → (¬r → ¬p) is a propositional axiom, we can apply
MP twice to get Σ ⊢ ¬p, which in view of the propositional axiom p → (¬p → ⊥)
and MP yields Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose t_Σ(p) = 1. Then t_Σ(q) = 1 or t_Σ(r) = 1. Hence Σ ⊢ q or Σ ⊢ r.
Using MP and the propositional axioms q → p and r → p we obtain Σ ⊢ p.

Case 3: p = q ∧ r, Σ ⊢ q ⟺ t_Σ(q) = 1, and Σ ⊢ r ⟺ t_Σ(r) = 1.
We leave this case as an exercise.


We can now finish the proof of Completeness (second form): 
Suppose Σ is consistent. Then by Lindenbaum’s Lemma Σ is a subset of a
complete set Σ′ of propositions. By the previous lemma, such a Σ′ has a model,
and such a model is also a model of Σ.

The converse (if Σ has a model, then Σ is consistent) is left to the reader.


Application to coloring infinite graphs. What follows is a standard use
of compactness of propositional logic, one of many. Let (V,E) be a graph, by
which we mean here that V is a set (of vertices) and E (the set of edges) is a
binary relation on V that is irreflexive and symmetric, that is, for all v,w ∈ V
we have (v,v) ∉ E, and if (v,w) ∈ E, then (w,v) ∈ E. Let some n ≥ 1 be
given. Then an n-coloring of (V,E) is a function c : V → {1,...,n} such that
c(v) ≠ c(w) for all (v,w) ∈ E: neighboring vertices should have different colors.

Suppose for every finite V_0 ⊆ V there is an n-coloring of (V_0, E_0), where
E_0 := E ∩ (V_0 × V_0). We claim that there exists an n-coloring of (V,E).


Proof. Take A := V × {1,...,n} as the set of atoms, and think of an atom (v,i)
as representing the statement that v has color i. Thus for (V,E) to have an
n-coloring means that the following set Σ ⊆ Prop(A) has a model:

    Σ := {(v,1) ∨ ··· ∨ (v,n) : v ∈ V} ∪ {¬((v,i) ∧ (v,j)) : v ∈ V, 1 ≤ i < j ≤ n}
         ∪ {¬((v,i) ∧ (w,i)) : (v,w) ∈ E, 1 ≤ i ≤ n}.

The assumption that all finite subgraphs of (V,E) are n-colorable yields that
every finite subset of Σ has a model. Hence by compactness Σ has a model.
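
For a finite graph the set Σ of the proof is finite, and a model can be found
by exhaustive search. The Python sketch below illustrates the translation from
colorings to truth assignments; it is not part of the original text and says
nothing about the infinite case, where compactness is needed. Each element of Σ
is represented by a function on truth assignments, edges are listed in one
direction only, and the names coloring_constraints and find_coloring are
hypothetical.

```python
from itertools import product


def coloring_constraints(vertices, edges, n):
    """Build Sigma as a list of functions on assignments t: atom (v, i) -> 0 or 1."""
    colors = range(1, n + 1)
    sigma = []
    for v in vertices:
        # (v,1) or ... or (v,n): v has at least one color.
        sigma.append(lambda t, v=v: max(t[(v, i)] for i in colors))
        # not((v,i) and (v,j)) for i < j: v has at most one color.
        for i in colors:
            for j in colors:
                if i < j:
                    sigma.append(
                        lambda t, v=v, i=i, j=j: 1 - min(t[(v, i)], t[(v, j)])
                    )
    for (v, w) in edges:
        # not((v,i) and (w,i)): neighbors do not share the color i.
        for i in colors:
            sigma.append(lambda t, v=v, w=w, i=i: 1 - min(t[(v, i)], t[(w, i)]))
    return sigma


def find_coloring(vertices, edges, n):
    """Search all truth assignments for a model of Sigma and read off a coloring."""
    atoms = [(v, i) for v in vertices for i in range(1, n + 1)]
    sigma = coloring_constraints(vertices, edges, n)
    for bits in product((0, 1), repeat=len(atoms)):
        t = dict(zip(atoms, bits))
        if all(p(t) == 1 for p in sigma):
            return {v: i for (v, i) in atoms if t[(v, i)] == 1}
    return None
```

For a triangle the search finds no model when n = 2 and returns a proper
coloring when n = 3. The search inspects 2^(|V|·n) assignments, so it is only
meant for very small graphs.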


Exercises. 
(1) Let (P,≤) be a poset. Then there is a linear order ≤′ on P that extends ≤. (Hint:
use the compactness theorem and the fact that this is true when P is finite.)


(2) Suppose Σ ⊆ Prop(A) is such that for each truth assignment t : A → {0,1} there
is p ∈ Σ with t(p) = 1. Then there are p_1,...,p_n ∈ Σ such that p_1 ∨ ··· ∨ p_n is a
tautology. (The interesting case is when A and Σ are infinite.)


2.3 Languages and Structures


Propositional Logic captures only one aspect of mathematical reasoning. We 
also need the capability to deal with predicates, variables, and the quantifiers 
“for all” and “there exists.” We now begin setting up a framework for Predicate 
Logic (or First-Order Logic, FOL), which has these additional features and has 
a claim on being a complete logic for mathematical reasoning. This claim will 
be formulated later in this chapter as the Completeness Theorem and proved in 
the next chapter. 


Definition. A language¹ L is a disjoint union of:
(i) a set L^r of relation symbols; each R ∈ L^r has associated arity a(R) ∈ ℕ;
(ii) a set L^f of function symbols; each F ∈ L^f has associated arity a(F) ∈ ℕ.
An m-ary relation or function symbol is one that has arity m. Instead of “0-ary”,
“1-ary”, “2-ary” we say “nullary”, “unary”, “binary”. A constant symbol
is a function symbol of arity 0.

In most cases the symbols of a language will be nullary, unary, or binary, 
but for good theoretical reasons we do not wish to exclude higher arities. 


Examples.

(1) The language L_Gr = {1, ⁻¹, ·} of groups has constant symbol 1, unary
function symbol ⁻¹, and binary function symbol ·.

(2) The language L_Ab = {0,−,+} of (additive) abelian groups has constant
symbol 0, unary function symbol −, and binary function symbol +.

(3) The language L_O = {<} has just one binary relation symbol <.

(4) The language L_OAb = {<,0,−,+} of ordered abelian groups.

(5) The language L_Rig = {0,1,+,·} of rigs (or semirings) has constant symbols
0 and 1, and binary function symbols + and ·.

(6) The language L_Ring = {0,1,−,+,·} of rings. The symbols are those of the
previous example, plus the unary function symbol −.


¹ What we call here a language is also known as a signature, or a vocabulary.


From now on, let L denote a language.


Definition. A structure 𝒜 for L (or L-structure) is a triple

    (A; (R^𝒜)_{R∈L^r}, (F^𝒜)_{F∈L^f})

consisting of:

(i) a nonempty set A, the underlying set of 𝒜;²

(ii) for each m-ary R ∈ L^r a set R^𝒜 ⊆ A^m (an m-ary relation on A), the
interpretation of R in 𝒜;

(iii) for each n-ary F ∈ L^f an operation F^𝒜 : A^n → A (an n-ary operation on
A), the interpretation of F in 𝒜.


Remark. The interpretation of a constant symbol c of L is a function

    c^𝒜 : A^0 → A.

Since A^0 has just one element, c^𝒜 is uniquely determined by its value at this
element; we shall identify c^𝒜 with this value, so c^𝒜 ∈ A.


Given an L-structure 𝒜, the relations R^𝒜 on A (for R ∈ L^r), and operations F^𝒜
on A (for F ∈ L^f) are called the primitives of 𝒜. When 𝒜 is clear from context
we often omit the superscript 𝒜 in denoting the interpretation of a symbol of L
in 𝒜. The reader is supposed to keep in mind the distinction between symbols
of L and their interpretation in an L-structure, even if we use the same notation
for both.


Examples.

(1) Each group is considered as an L_Gr-structure by interpreting the symbols
1, ⁻¹, and · as the identity element of the group, its group inverse, and its
group multiplication, respectively.

(2) Let 𝒜 = (A; 0,−,+) be an abelian group; here 0 ∈ A is the zero element
of the group, and − : A → A and + : A^2 → A denote the group operations
of 𝒜. We consider 𝒜 as an L_Ab-structure by taking as interpretations of
the symbols 0, − and + of L_Ab the group operations 0, − and + on A.
(We took here the liberty of using the same notation for possibly entirely
different things: + is an element of the set L_Ab, but also denotes in this
context its interpretation as a binary operation on the set A. Similarly
with 0 and −.) In fact, any set A in which we single out an element, a
unary operation on A, and a binary operation on A, can be construed as
an L_Ab-structure if we choose to do so.

(3) (ℕ; <) is an L_O-structure where we interpret < as the usual ordering
relation on ℕ. Similarly for (ℤ; <), (ℚ; <) and (ℝ; <). (Here we take
even more notational liberties, by letting < denote five different things: a
symbol of L_O, and the usual orderings of ℕ, ℤ, ℚ, and ℝ respectively.)
Again, any nonempty set A equipped with a binary relation on it can be
viewed as an L_O-structure.

(4) (ℤ; <,0,−,+) and (ℚ; <,0,−,+) are both L_OAb-structures.

(5) (ℕ; 0,1,+,·) is an L_Rig-structure.

(6) (ℤ; 0,1,−,+,·) is an L_Ring-structure.


² It is also called the universe of 𝒜; we prefer less grandiose terminology.


Let ℬ be an L-structure with underlying set B, and let A be a nonempty subset
of B such that F^ℬ(A^n) ⊆ A for every n and n-ary F ∈ L^f. Then A is the
underlying set of an L-structure 𝒜 defined by letting

    F^𝒜 := F^ℬ|_{A^n} : A^n → A,  for n-ary F ∈ L^f,
    R^𝒜 := R^ℬ ∩ A^m              for m-ary R ∈ L^r.


Definition. Such an L-structure 𝒜 is said to be a substructure of ℬ, notation:
𝒜 ⊆ ℬ. We also say in this case that ℬ is an extension of 𝒜, or extends 𝒜.


Examples.
(1) (ℤ; 0,1,−,+,·) ⊆ (ℚ; 0,1,−,+,·) ⊆ (ℝ; 0,1,−,+,·)
(2) (ℕ; <,0,1,+,·) ⊆ (ℤ; <,0,1,+,·)


Definition. Let 𝒜 = (A;...) and ℬ = (B;...) be L-structures.
A homomorphism h : 𝒜 → ℬ is a map h : A → B such that

(i) for each m-ary R ∈ L^r and each (a_1,...,a_m) ∈ A^m we have

    (a_1,...,a_m) ∈ R^𝒜 ⟹ (ha_1,...,ha_m) ∈ R^ℬ;

(ii) for each n-ary F ∈ L^f and each (a_1,...,a_n) ∈ A^n we have

    h(F^𝒜(a_1,...,a_n)) = F^ℬ(ha_1,...,ha_n).


Replacing ⟹ in (i) by ⟺ yields the notion of a strong homomorphism. An
embedding is an injective strong homomorphism; an isomorphism is a bijective
strong homomorphism. An automorphism of 𝒜 is an isomorphism 𝒜 → 𝒜.
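
For finite structures, conditions (i) and (ii) can be checked directly by
running through all tuples. The following Python sketch is an illustration
under assumptions of our own: a structure is a dict with keys "universe",
"relations" (symbol to pair of arity and set of tuples) and "functions" (symbol
to pair of arity and Python callable), and a map h is a dict. The name
is_homomorphism is hypothetical.

```python
from itertools import product


def is_homomorphism(h, A, B, strong=False):
    """Check that the dict h is a (strong) homomorphism from structure A to B.

    A structure is a dict with keys "universe" (a list), "relations"
    (symbol -> (arity, set of tuples)) and "functions" (symbol -> (arity, callable)).
    """
    universe = A["universe"]
    for symbol, (m, rel_a) in A["relations"].items():
        rel_b = B["relations"][symbol][1]
        for args in product(universe, repeat=m):
            image = tuple(h[a] for a in args)
            # Condition (i): tuples in the relation of A map into the relation of B.
            if args in rel_a and image not in rel_b:
                return False
            # Strong version: the converse implication must hold as well.
            if strong and image in rel_b and args not in rel_a:
                return False
    for symbol, (n, f_a) in A["functions"].items():
        f_b = B["functions"][symbol][1]
        for args in product(universe, repeat=n):
            # Condition (ii): h commutes with the interpretations of the symbol.
            if h[f_a(*args)] != f_b(*(h[a] for a in args)):
                return False
    return True


# The integers modulo 6 as a structure for the language {0, -, +}
# (a finite stand-in chosen for this sketch).
Z6 = {
    "universe": list(range(6)),
    "relations": {},
    "functions": {
        "0": (0, lambda: 0),
        "-": (1, lambda a: (-a) % 6),
        "+": (2, lambda a, b: (a + b) % 6),
    },
}
negation = {k: (-k) % 6 for k in range(6)}
```

Here is_homomorphism(negation, Z6, Z6, strong=True) holds, and since negation
is a bijection it is an automorphism of this structure. A constant symbol is
handled as a function of arity 0, in line with the remark on c^𝒜 above.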

If 𝒜 ⊆ ℬ, then the inclusion a ↦ a : A → B is an embedding 𝒜 → ℬ.
Conversely, a homomorphism h : 𝒜 → ℬ yields a substructure h(𝒜) of ℬ with
underlying set h(A), and if h is an embedding we have an isomorphism

    a ↦ h(a) : 𝒜 → h(𝒜).


If i : 𝒜 → ℬ and j : ℬ → 𝒞 are homomorphisms (strong homomorphisms,
embeddings, isomorphisms, respectively), then so is j ∘ i : 𝒜 → 𝒞. The identity
map 1_A on A is an automorphism of 𝒜. If i : 𝒜 → ℬ is an isomorphism then so
is the map i⁻¹ : ℬ → 𝒜. Thus the automorphisms of 𝒜 form a group Aut(𝒜)
under composition with identity 1_A.


Examples.

1. Let 𝒜 = (ℤ; 0,−,+). Then k ↦ −k is an automorphism of 𝒜.


2. Let 𝒜 = (ℤ; <). The map k ↦ k + 1 is an automorphism of 𝒜 with
inverse given by k ↦ k − 1.


If 𝒜 and ℬ are groups (viewed as structures for the language L_Gr), then a
homomorphism h : 𝒜 → ℬ is exactly what in algebra is called a homomorphism
from the group 𝒜 to the group ℬ. Likewise with rings, and other kinds of
algebraic structures.


A congruence on the L-structure 𝒜 is an equivalence relation ∼ on its underlying
set A such that
(i) if R ∈ L^r is m-ary and a_1 ∼ b_1,...,a_m ∼ b_m, then

    (a_1,...,a_m) ∈ R^𝒜 ⟺ (b_1,...,b_m) ∈ R^𝒜;

(ii) if F ∈ L^f is n-ary and a_1 ∼ b_1,...,a_n ∼ b_n, then

    F^𝒜(a_1,...,a_n) ∼ F^𝒜(b_1,...,b_n).


Note that a strong homomorphism h : 𝒜 → ℬ yields a congruence ∼_h on 𝒜 as
follows: for a_1,a_2 ∈ A we put

    a_1 ∼_h a_2 :⟺ h(a_1) = h(a_2).


Given a congruence ∼ on the L-structure 𝒜 we obtain an L-structure 𝒜/∼ (the
quotient of 𝒜 by ∼) as follows, where a^∼ denotes the equivalence class of a ∈ A:
(i) the underlying set of 𝒜/∼ is the quotient set A/∼;
(ii) the interpretation of an m-ary R ∈ L^r in 𝒜/∼ is the m-ary relation

    {(a_1^∼,...,a_m^∼) : (a_1,...,a_m) ∈ R^𝒜}

on A/∼;

(iii) the interpretation of an n-ary F ∈ L^f in 𝒜/∼ is the n-ary operation

    (a_1^∼,...,a_n^∼) ↦ F^𝒜(a_1,...,a_n)^∼

on A/∼.


Note that then we have a strong homomorphism a ↦ a^∼ : 𝒜 → 𝒜/∼.


Products. To combine many structures into a single one we form products. Let
(ℬ_i)_{i∈I} be a family of L-structures, ℬ_i = (B_i;...) for i ∈ I. The product

    ∏_{i∈I} ℬ_i

is defined to be the L-structure ℬ whose underlying set is the product set
∏_{i∈I} B_i, and where the basic relations and functions are defined coordinatewise:
for m-ary R ∈ L^r and elements b_1 = (b_{1i}),...,b_m = (b_{mi}) ∈ ∏_{i∈I} B_i,

    (b_1,...,b_m) ∈ R^ℬ ⟺ (b_{1i},...,b_{mi}) ∈ R^{ℬ_i} for all i ∈ I,

and for n-ary F ∈ L^f and b_1 = (b_{1i}),...,b_n = (b_{ni}) ∈ ∏_{i∈I} B_i,

    F^ℬ(b_1,...,b_n) := (F^{ℬ_i}(b_{1i},...,b_{ni}))_{i∈I}.


For j ∈ I the projection map to the jth factor is the homomorphism

    ∏_{i∈I} ℬ_i → ℬ_j,  (b_i) ↦ b_j.


Using products we can combine several homomorphisms with a common domain
into a single one: if for each i ∈ I we have a homomorphism h_i : 𝒜 → ℬ_i we
obtain a homomorphism

    a ↦ (h_i(a))_{i∈I} : 𝒜 → ∏_{i∈I} ℬ_i.

Exercises. For (1) below, recall that a normal subgroup of a group G is a subgroup
N of G such that axa⁻¹ ∈ N for all a ∈ G and x ∈ N.


(1) Let G be a group viewed as a structure for the language of groups. Each normal
subgroup N of G yields a congruence ≡_N on G by

    a ≡_N b :⟺ aN = bN,

and each congruence on G equals ≡_N for a unique normal subgroup N of G.


(2) Consider a strong homomorphism h : 𝒜 → ℬ of L-structures. Then we have an
isomorphism from 𝒜/∼_h onto h(𝒜) given by a^{∼_h} ↦ h(a).


2.4 Variables and Terms 


Throughout this course

$$\mathrm{Var} = \{v_0, v_1, v_2, \dots\}$$

is a countably infinite set of symbols whose elements will be called variables;
we assume that $v_m \neq v_n$ for $m \neq n$, and that no variable is a function or
relation symbol in any language. We let $x, y, z$ (sometimes with subscripts or
superscripts) denote variables, unless indicated otherwise.


Remark. Chapters 2-4 go through if we take as our set Var of variables any 
infinite (possibly uncountable) set; in model theory this can even be convenient. 
For this more general Var we still insist that no variable is a function or relation 
symbol in any language. In the few cases in Chapters 2-4 where this more general
set-up requires changes in proofs, this will be pointed out.

The results in Chapter 5 on undecidability presuppose a numbering of the
variables; our $\mathrm{Var} = \{v_0, v_1, v_2, \dots\}$ comes equipped with such a numbering.


Definition. An L-term is a word on the alphabet $L^{\mathrm{f}} \cup \mathrm{Var}$ obtained as follows:
(i) each variable (viewed as a word of length 1) is an L-term;
(ii) whenever $F \in L^{\mathrm{f}}$ is $n$-ary and $t_1,\dots,t_n$ are L-terms, then the
concatenation $Ft_1 \dots t_n$ is an L-term.


Note: constant symbols of $L$ are L-terms of length 1, by clause (ii) for $n = 0$.
The L-terms are the admissible words on the alphabet $L^{\mathrm{f}} \cup \mathrm{Var}$ where each
variable has arity 0. Thus “unique readability” is available.
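
Unique readability is what makes prefix words mechanically parseable. The following is a minimal Python sketch, not part of the original notes: the arity table `ARITY` and the function names are illustrative assumptions, and symbols are single characters for simplicity.

```python
# Illustrative arities for a ring-like language; any symbol not listed
# is treated as a variable, that is, a symbol of arity 0.
ARITY = {"0": 0, "1": 0, "-": 1, "+": 2, "*": 2}

def parse_term(word, pos=0):
    """Parse one term starting at word[pos]; return (tree, next_pos)."""
    if pos >= len(word):
        raise ValueError("unexpected end of word")
    symbol = word[pos]
    pos += 1
    args = []
    # A symbol of arity n must be followed by exactly n terms.
    for _ in range(ARITY.get(symbol, 0)):
        sub, pos = parse_term(word, pos)
        args.append(sub)
    return (symbol, tuple(args)), pos

def read_term(word):
    """Return the parse tree of word, or raise if word is not a term."""
    tree, end = parse_term(word)
    if end != len(word):
        raise ValueError("trailing symbols after a complete term")
    return tree

# The prefix word *+x-yz, i.e. (x + (-y)) * z in infix notation.
tree = read_term("*+x-yz")
```

Because each symbol determines how many terms follow it, the parser never has to backtrack; a word such as `xy` is rejected since a complete term ends after `x`.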

We often write $t(x_1,\dots,x_n)$ to indicate an L-term $t$ in which no variables
other than $x_1,\dots,x_n$ occur. Whenever we use this notation we assume tacitly
that $x_1,\dots,x_n$ are distinct. Note that we do not require that each of $x_1,\dots,x_n$
actually occurs in $t(x_1,\dots,x_n)$. (This is like indicating a polynomial in the
indeterminates $x_1,\dots,x_n$ by $p(x_1,\dots,x_n)$, where one allows that some of these
indeterminates do not actually occur in the polynomial $p$.)


If a term is written as an admissible word, then it may be hard to see how 
it is built up from subterms. In practice we shall therefore use parentheses 
and brackets in denoting terms, and avoid prefix notation if tradition dictates 
otherwise. 


Example. The word $\cdot + x - y z$ is an $L_{\mathrm{Ring}}$-term. For easier reading we indicate
this term instead by $(x + (-y)) \cdot z$ or even $(x - y)z$.


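Definition. Let $\mathcal{A}$ be an L-structure and $t = t(\vec{x})$ be an L-term where
$\vec{x} = (x_1,\dots,x_m)$. Then we associate to the ordered pair $(t, \vec{x})$ a function
$t^{\mathcal{A}} : A^m \to A$ as follows:
(i) If $t$ is the variable $x_i$, then $t^{\mathcal{A}}(a) = a_i$ for $a = (a_1,\dots,a_m) \in A^m$.
(ii) If $t = Ft_1 \dots t_n$ where $F \in L^{\mathrm{f}}$ is $n$-ary and $t_1,\dots,t_n$ are L-terms, then
$t^{\mathcal{A}}(a) = F^{\mathcal{A}}(t_1^{\mathcal{A}}(a),\dots,t_n^{\mathcal{A}}(a))$ for $a \in A^m$.

This inductive definition is justified by unique readability. Note that if $\mathcal{B}$ is a
second L-structure and $\mathcal{A} \subseteq \mathcal{B}$, then $t^{\mathcal{A}}(a) = t^{\mathcal{B}}(a)$ for $t$ as above and $a \in A^m$.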


Example. Consider $\mathbb{R}$ as a ring in the usual way, and let $t(x, y, z)$ be the $L_{\mathrm{Ring}}$-term
$(x - y)z$. Then the function $t^{\mathbb{R}} : \mathbb{R}^3 \to \mathbb{R}$ is given by $t^{\mathbb{R}}(a,b,c) = (a - b)c$.
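
The inductive definition of the function associated to a term translates directly into a recursive evaluator. Here is a small Python sketch under stated assumptions: terms are represented as nested tuples, the names `evaluate` and `Z_RING` are hypothetical, and the ring of integers stands in for the ring of reals used in the example.

```python
# A term is either a variable name (a str) or a tuple (F, t1, ..., tn);
# a constant symbol c is the one-element tuple (c,).
def evaluate(term, interp, assignment):
    """Compute the value of term: interp maps function symbols to Python
    functions, assignment maps variable names to elements."""
    if isinstance(term, str):
        # Clause (i): a variable evaluates to the element assigned to it.
        return assignment[term]
    # Clause (ii): evaluate the subterms, then apply the interpretation of F.
    symbol, *args = term
    values = [evaluate(arg, interp, assignment) for arg in args]
    return interp[symbol](*values)

# The ring of integers as a structure for the language {0, 1, -, +, *}.
Z_RING = {
    "0": lambda: 0,
    "1": lambda: 1,
    "-": lambda a: -a,
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}

# The term (x + (-y)) * z, written (x - y)z in the text.
t = ("*", ("+", "x", ("-", "y")), "z")
value = evaluate(t, Z_RING, {"x": 7, "y": 2, "z": 3})  # (7 - 2) * 3
```

Swapping `Z_RING` for the interpretation of the same symbols in another ring evaluates the same term there, which is the sense in which the function depends on the structure as well as on the term.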


A term is said to be variable-free if no variables occur in it. Let $t$ be a variable-free
L-term and $\mathcal{A}$ an L-structure. Then the above gives a nullary function
$t^{\mathcal{A}} : A^0 \to A$, identified as usual with its value at the unique element of $A^0$, so
$t^{\mathcal{A}} \in A$. In other words, if $t$ is a constant symbol $c$, then $t^{\mathcal{A}} = c^{\mathcal{A}} \in A$, where
$c^{\mathcal{A}}$ is as in the previous section, and if $t = Ft_1 \dots t_n$ with $n$-ary $F \in L^{\mathrm{f}}$ and
variable-free L-terms $t_1,\dots,t_n$, then $t^{\mathcal{A}} = F^{\mathcal{A}}(t_1^{\mathcal{A}},\dots,t_n^{\mathcal{A}})$.


Definition. Let $t$ be an L-term, let $x_1,\dots,x_n$ be distinct variables, and let
$\tau_1,\dots,\tau_n$ be L-terms. Then $t(\tau_1/x_1,\dots,\tau_n/x_n)$ is the word obtained by
replacing all occurrences of $x_i$ in $t$ by $\tau_i$, simultaneously for $i = 1,\dots,n$. If $t$ is
given in the form $t(x_1,\dots,x_n)$, then we write $t(\tau_1,\dots,\tau_n)$ as a shorthand for
$t(\tau_1/x_1,\dots,\tau_n/x_n)$.


The easy proof of the next lemma is left to the reader. 




Lemma 2.4.1. Suppose $t$ is an L-term, $x_1,\dots,x_n$ are distinct variables, and
$\tau_1,\dots,\tau_n$ are L-terms. Then $t(\tau_1/x_1,\dots,\tau_n/x_n)$ is an L-term. If $\tau_1,\dots,\tau_n$ are
variable-free and $t = t(x_1,\dots,x_n)$, then $t(\tau_1,\dots,\tau_n)$ is variable-free.


We urge the reader to do exercise (1) below and thus acquire the confidence that 
these formal term substitutions do correspond to actual function substitutions. 
In the definition of $t(\tau_1/x_1,\dots,\tau_n/x_n)$ the “replacing” should be simultaneous,
because it can happen that for $t' := t(\tau_1/x_1)$ we have $t'(\tau_2/x_2) \neq t(\tau_1/x_1, \tau_2/x_2)$.
(Here $t, \tau_1, \tau_2$ are L-terms and $x_1, x_2$ are distinct variables.)
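
A concrete instance of this phenomenon can be checked mechanically. The sketch below is an illustration only, with terms as nested tuples and a hypothetical helper `substitute`; it takes $t = x_1 + x_2$, lets $\tau_1$ be the variable $x_2$ and $\tau_2$ the constant $0$.

```python
# A term is a variable name (str) or a tuple (F, t1, ..., tn).
def substitute(term, mapping):
    """Simultaneously replace each variable x in term by mapping[x]."""
    if isinstance(term, str):
        return mapping.get(term, term)
    symbol, *args = term
    return (symbol, *(substitute(arg, mapping) for arg in args))

t = ("+", "x1", "x2")
tau1 = "x2"      # tau1 is the variable x2
tau2 = ("0",)    # tau2 is the constant symbol 0

# Simultaneous substitution: x1 becomes x2 and the original x2 becomes 0.
simultaneous = substitute(t, {"x1": tau1, "x2": tau2})

# Sequential substitution: the x2 introduced in the first step is also
# replaced in the second step.
sequential = substitute(substitute(t, {"x1": tau1}), {"x2": tau2})
```

Here the simultaneous result is the term $x_2 + 0$, while the sequential result is $0 + 0$.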


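Generators. Let $\mathcal{B}$ be an L-structure, let $G \subseteq B$, and assume also that $L$ has
a constant symbol or that $G \neq \emptyset$. Then the set

$$\{t^{\mathcal{B}}(g_1,\dots,g_m) : t(x_1,\dots,x_m) \text{ is an L-term and } g_1,\dots,g_m \in G\} \subseteq B$$

is the underlying set of some $\mathcal{A} \subseteq \mathcal{B}$, and this $\mathcal{A}$ is clearly a substructure of any
$\mathcal{A}' \subseteq \mathcal{B}$ with $G \subseteq A'$. We call this $\mathcal{A}$ the substructure of $\mathcal{B}$ generated by $G$; if
$\mathcal{A} = \mathcal{B}$, then we say that $\mathcal{B}$ is generated by $G$. If $(a_i)_{i \in I}$ is a family of elements of
$B$, then “generated by $(a_i)$” means “generated by $G$” where $G = \{a_i : i \in I\}$.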
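
When the generated substructure is finite, its underlying set can be computed by closing $G$ under the basic functions until nothing new appears. This is a minimal sketch under that finiteness assumption; the function name and the encoding of operations as `(arity, function)` pairs are choices made here for illustration.

```python
from itertools import product

def generated_substructure(generators, operations):
    """Close a set under operations given as (arity, function) pairs.
    The loop terminates whenever the generated substructure is finite."""
    closed = set(generators)
    changed = True
    while changed:
        changed = False
        for arity, op in operations:
            # list(closed) is a snapshot, so adding to closed is safe.
            # For arity 0, product yields one empty tuple, which adds
            # the interpretation of the constant symbol.
            for args in product(list(closed), repeat=arity):
                value = op(*args)
                if value not in closed:
                    closed.add(value)
                    changed = True
    return closed

# Z/12Z as a structure for the language {0, -, +} of abelian groups.
Z12 = [
    (0, lambda: 0),
    (1, lambda a: -a % 12),
    (2, lambda a, b: (a + b) % 12),
]

subgroup = generated_substructure({8}, Z12)
```

With $G = \emptyset$ the result is $\{0\}$, which is why the text requires a constant symbol or a nonempty $G$: otherwise the set would be empty and could not underlie a structure.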


Exercises. 

(1) Let $t(x_1,\dots,x_m)$ and $\tau_1(y_1,\dots,y_n),\dots,\tau_m(y_1,\dots,y_n)$ be L-terms. Then the
L-term $t^*(y_1,\dots,y_n) := t(\tau_1(y_1,\dots,y_n),\dots,\tau_m(y_1,\dots,y_n))$ has the property that
if $\mathcal{A}$ is an L-structure and $a = (a_1,\dots,a_n) \in A^n$, then

$$(t^*)^{\mathcal{A}}(a) = t^{\mathcal{A}}\bigl(\tau_1^{\mathcal{A}}(a),\dots,\tau_m^{\mathcal{A}}(a)\bigr).$$

(2) For every $L_{\mathrm{Ab}}$-term $t(x_1,\dots,x_n)$ there are integers $k_1,\dots,k_n$ such that for every
abelian group $\mathcal{A} = (A; 0, -, +)$,

$$t^{\mathcal{A}}(a_1,\dots,a_n) = k_1 a_1 + \dots + k_n a_n, \quad \text{for all } (a_1,\dots,a_n) \in A^n.$$

Conversely, for any integers $k_1,\dots,k_n$ there is an $L_{\mathrm{Ab}}$-term $t(x_1,\dots,x_n)$ such
that in every abelian group $\mathcal{A} = (A; 0, -, +)$ the above displayed identity holds.


(3) For every $L_{\mathrm{Ring}}$-term $t(x_1,\dots,x_n)$ there is a polynomial

$$P(x_1,\dots,x_n) \in \mathbb{Z}[x_1,\dots,x_n]$$

such that for every commutative ring $\mathcal{R} = (R; 0, 1, -, +, \cdot)$,

$$t^{\mathcal{R}}(r_1,\dots,r_n) = P(r_1,\dots,r_n), \quad \text{for all } (r_1,\dots,r_n) \in R^n.$$

Conversely, for any polynomial $P(x_1,\dots,x_n) \in \mathbb{Z}[x_1,\dots,x_n]$ there is an
$L_{\mathrm{Ring}}$-term $t(x_1,\dots,x_n)$ such that in every commutative ring $\mathcal{R} = (R; 0, 1, -, +, \cdot)$ the
above displayed identity holds.


(4) Let $\mathcal{A}$ and $\mathcal{B}$ be L-structures, $h : \mathcal{A} \to \mathcal{B}$ a homomorphism, and $t = t(x_1,\dots,x_n)$
an L-term. Then

$$h(t^{\mathcal{A}}(a_1,\dots,a_n)) = t^{\mathcal{B}}(ha_1,\dots,ha_n), \quad \text{for all } (a_1,\dots,a_n) \in A^n.$$

(If $\mathcal{A} \subseteq \mathcal{B}$ and $h : \mathcal{A} \to \mathcal{B}$ is the inclusion, this gives $t^{\mathcal{A}}(a_1,\dots,a_n) = t^{\mathcal{B}}(a_1,\dots,a_n)$
for all $(a_1,\dots,a_n) \in A^n$.)

(5) Consider the L-structure $\mathcal{N} = (\mathbb{N}; 0, 1, +, \cdot)$ where $L = L_{\mathrm{Rig}}$.
(a) Is there an L-term $t(x)$ such that $t^{\mathcal{N}}(0) = 1$ and $t^{\mathcal{N}}(1) = 0$?
(b) Is there an L-term $t(x)$ such that $t^{\mathcal{N}}(n) = 2^n$ for all $n \in \mathbb{N}$?
(c) Find all the substructures of $\mathcal{N}$.


2.5 Formulas and Sentences 
Besides variables we also introduce the eight distinct logical symbols

$$\top \quad \bot \quad \neg \quad \vee \quad \wedge \quad = \quad \exists \quad \forall$$


The first five of these we already met when discussing propositional logic. None 
of these eight symbols is a variable, or a function or relation symbol of any 
language. Below L denotes a language. To distinguish the logical symbols from 
those in L, the latter are often referred to as the non-logical symbols. 


Definition. The atomic L-formulas are the following words on the alphabet
$L \cup \mathrm{Var} \cup \{\top, \bot, =\}$:

(i) $\top$ and $\bot$,

(ii) $Rt_1 \dots t_m$, where $R \in L^{\mathrm{r}}$ is $m$-ary and $t_1,\dots,t_m$ are L-terms,

(iii) $= t_1 t_2$, where $t_1$ and $t_2$ are L-terms.


The L-formulas are the words on the larger alphabet

$$L \cup \mathrm{Var} \cup \{\top, \bot, \neg, \vee, \wedge, =, \exists, \forall\}$$

obtained as follows:

(i) every atomic L-formula is an L-formula;

(ii) if $\varphi, \psi$ are L-formulas, then so are $\neg\varphi$, $\vee\varphi\psi$, and $\wedge\varphi\psi$;

(iii) if $\varphi$ is an L-formula and $x$ is a variable, then $\exists x\varphi$ and $\forall x\varphi$ are L-formulas.


Note that all L-formulas are admissible words on the alphabet

$$L \cup \mathrm{Var} \cup \{\top, \bot, \neg, \vee, \wedge, =, \exists, \forall\},$$

where $=$, $\exists$ and $\forall$ are given arity 2 and the other symbols have the arities
assigned to them earlier. This fact makes the results on unique readability
applicable to L-formulas. (However, not all admissible words on this alphabet
are L-formulas: the word $\exists xx$ is admissible but not an L-formula.)

The notational conventions introduced in the section on propositional logic
go through, with the role of propositions there taken over by formulas here. (For
example, given L-formulas $\varphi$ and $\psi$ we shall write $\varphi \vee \psi$ to indicate $\vee\varphi\psi$, and
$\varphi \to \psi$ to indicate $\neg\varphi \vee \psi$.) Here is a notational convention specific to predicate
logic: given distinct variables $x_1,\dots,x_n$ and an L-formula $\varphi$ we let $\exists x_1 \dots x_n \varphi$
and $\forall x_1 \dots x_n \varphi$ abbreviate $\exists x_1 \dots \exists x_n \varphi$ and $\forall x_1 \dots \forall x_n \varphi$, respectively. Thus
if $x, y, z$ are distinct variables, then $\exists xyz\, \varphi$ stands for $\exists x \exists y \exists z\, \varphi$.

The reader should distinguish between different ways of using the symbol $=$.
Sometimes it denotes one of the eight formal logical symbols, but we also use it
to indicate equality of mathematical objects in the way we have done already
many times. The context should always make it clear what our intention is in
this respect without having to spell it out. To increase readability we usually
write an atomic formula $= t_1 t_2$ as $t_1 = t_2$ and its negation $\neg = t_1 t_2$ as $t_1 \neq t_2$,
where $t_1, t_2$ are L-terms. The logical symbol $=$ is treated just as a binary relation
symbol, but its interpretation in a structure will always be the equality relation
on its underlying set. This will become clear later.


Definition. Let $\varphi$ be a formula of $L$. Written as a word on the alphabet above
we have $\varphi = s_1 \dots s_m$. A subformula of $\varphi$ is a subword of the form $s_i \dots s_k$
where $1 \le i \le k \le m$ which also happens to be a formula of $L$.

An occurrence of a variable $x$ in $\varphi$ at the $j$-th place (that is, $s_j = x$) is said
to be a bound occurrence if $\varphi$ has a subformula $s_i s_{i+1} \dots s_k$ with $i \le j \le k$ that
is of the form $\exists x\psi$ or $\forall x\psi$. If an occurrence is not bound, then it is said to be
a free occurrence.


At this point the reader is invited to do the first exercise at the end of this 
section, which gives another useful characterization of subformulas. 


Example. In the formula $(\exists x(x = y)) \wedge x = 0$, where $x$ and $y$ are distinct, the
first two occurrences of $x$ are bound, the third is free, and the only occurrence
of $y$ is free. (Note: the formula is actually the string $\wedge \exists x = xy = x0$, and the
occurrences of $x$ and $y$ are really the occurrences in this string.)


Definition. A sentence is a formula in which all occurrences of variables are 
bound occurrences. 
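
The set of variables with at least one free occurrence can be computed by recursion on the formula, and a sentence is then a formula for which this set is empty. The sketch below is illustrative only: the tuple encoding of formulas and the tag names (`"exists"`, `"rel"`, and so on) are assumptions made here, not notation from the notes.

```python
# Terms: a variable is a str, F t1...tn is the tuple (F, t1, ..., tn).
def term_vars(term):
    """Return the set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(term_vars(arg) for arg in term[1:]))

# Formulas: ("top",), ("bot",), ("=", t1, t2), ("rel", R, t1, ..., tm),
# ("not", f), ("or", f, g), ("and", f, g), ("exists", x, f), ("forall", x, f).
def free_vars(formula):
    """Return the set of variables that occur free in the formula."""
    op = formula[0]
    if op in ("top", "bot"):
        return set()
    if op == "=":
        return term_vars(formula[1]) | term_vars(formula[2])
    if op == "rel":
        return set().union(*(term_vars(t) for t in formula[2:]))
    if op == "not":
        return free_vars(formula[1])
    if op in ("or", "and"):
        return free_vars(formula[1]) | free_vars(formula[2])
    if op in ("exists", "forall"):
        # The quantifier binds every occurrence of its variable in the body.
        return free_vars(formula[2]) - {formula[1]}
    raise ValueError(f"unknown formula shape: {op!r}")

def is_sentence(formula):
    return not free_vars(formula)

# The formula (exists x (x = y)) and x = 0 from the example above.
phi = ("and", ("exists", "x", ("=", "x", "y")), ("=", "x", ("0",)))
```

For this `phi` both $x$ and $y$ occur free, even though $x$ also has bound occurrences; prefixing the quantifiers $\forall x \forall y$ turns it into a sentence.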


We let $\varphi(x_1,\dots,x_n)$ indicate a formula $\varphi$ such that all variables that occur
free in $\varphi$ are among $x_1,\dots,x_n$. In using this notation it is understood that
$x_1,\dots,x_n$ are distinct variables, but it is not required that each of $x_1,\dots,x_n$
occurs free in $\varphi$. (This is analogous to indicating a polynomial equation in the
indeterminates $x_1,\dots,x_n$ by $p(x_1,\dots,x_n) = 0$, where one allows that some of
these indeterminates do not actually occur in $p$.)


Definition. Let $\varphi$ be an L-formula, let $x_1,\dots,x_n$ be distinct variables, and
let $t_1,\dots,t_n$ be L-terms. Then $\varphi(t_1/x_1,\dots,t_n/x_n)$ is the word obtained by
replacing all the free occurrences of $x_i$ in $\varphi$ by $t_i$, simultaneously for $i = 1,\dots,n$.
If $\varphi$ is given in the form $\varphi(x_1,\dots,x_n)$, then we write $\varphi(t_1,\dots,t_n)$ as a shorthand
for $\varphi(t_1/x_1,\dots,t_n/x_n)$.


We have the following lemma whose routine proof is left to the reader. 


Lemma 2.5.1. Suppose $\varphi$ is an L-formula, $x_1,\dots,x_n$ are distinct variables,
and $t_1,\dots,t_n$ are L-terms. Then $\varphi(t_1/x_1,\dots,t_n/x_n)$ is an L-formula. If
$t_1,\dots,t_n$ are variable-free and $\varphi = \varphi(x_1,\dots,x_n)$, then $\varphi(t_1,\dots,t_n)$ is an
L-sentence.


In the definition of $\varphi(t_1/x_1,\dots,t_n/x_n)$ the “replacing” should be simultaneous,
because it can happen that $\varphi(t_1/x_1)(t_2/x_2) \neq \varphi(t_1/x_1, t_2/x_2)$.

Let $\mathcal{A}$ be an L-structure with underlying set $A$, and let $C \subseteq A$. We extend $L$ to
a language $L_C$ by adding a constant symbol $\underline{c}$ for each $c \in C$, called the name
of $c$. These names are symbols not in $L$. We make $\mathcal{A}$ into an $L_C$-structure
by keeping the same underlying set and interpretations of symbols of $L$, and
by interpreting each name $\underline{c}$ as the element $c \in C$. The $L_C$-structure thus
obtained is indicated by $\mathcal{A}_C$. Hence for each variable-free $L_C$-term $t$ we have
a corresponding element $t^{\mathcal{A}_C}$ of $A$, which for simplicity of notation we denote
instead by $t^{\mathcal{A}}$. All this applies in particular to the case $C = A$, where in $L_A$ we
have a name $\underline{a}$ for each $a \in A$.


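Definition. We can now define what it means for an $L_A$-sentence $\sigma$ to be true
in the L-structure $\mathcal{A}$ (notation: $\mathcal{A} \models \sigma$, also read as $\mathcal{A}$ satisfies $\sigma$ or $\sigma$ holds in
$\mathcal{A}$, or $\sigma$ is valid in $\mathcal{A}$). First we consider atomic $L_A$-sentences:

(i) $\mathcal{A} \models \top$, and $\mathcal{A} \not\models \bot$;

(ii) $\mathcal{A} \models Rt_1 \dots t_m$ if and only if $(t_1^{\mathcal{A}},\dots,t_m^{\mathcal{A}}) \in R^{\mathcal{A}}$, for $m$-ary $R \in L^{\mathrm{r}}$, and
variable-free $L_A$-terms $t_1,\dots,t_m$;

(iii) $\mathcal{A} \models t_1 = t_2$ if and only if $t_1^{\mathcal{A}} = t_2^{\mathcal{A}}$, for variable-free $L_A$-terms $t_1, t_2$.


We extend the definition inductively to arbitrary $L_A$-sentences as follows:
(i) $\sigma = \neg\sigma_1$: then $\mathcal{A} \models \sigma$ if and only if $\mathcal{A} \not\models \sigma_1$.
(ii) $\sigma = \sigma_1 \vee \sigma_2$: then $\mathcal{A} \models \sigma$ if and only if $\mathcal{A} \models \sigma_1$ or $\mathcal{A} \models \sigma_2$.
(iii) $\sigma = \sigma_1 \wedge \sigma_2$: then $\mathcal{A} \models \sigma$ if and only if $\mathcal{A} \models \sigma_1$ and $\mathcal{A} \models \sigma_2$.
(iv) $\sigma = \exists x\varphi(x)$: then $\mathcal{A} \models \sigma$ if and only if $\mathcal{A} \models \varphi(\underline{a})$ for some $a \in A$.
(v) $\sigma = \forall x\varphi(x)$: then $\mathcal{A} \models \sigma$ if and only if $\mathcal{A} \models \varphi(\underline{a})$ for all $a \in A$.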


Even if we just want to define $\mathcal{A} \models \sigma$ for L-sentences $\sigma$, one can see that if
$\sigma$ has the form $\exists x\varphi(x)$ or $\forall x\varphi(x)$, the inductive definition above forces us to
consider $L_A$-sentences $\varphi(\underline{a})$. This is why we introduced names. We didn't say so
explicitly, but “inductive” refers here to induction with respect to the number of
logical symbols in $\sigma$. For example, the fact that $\varphi(\underline{a})$ has fewer logical symbols
than $\exists x\varphi(x)$ is crucial for the above to count as a definition. Also unique
readability is involved: without it we would not allow clauses (ii) and (iii) as
part of our inductive definition.
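
For a finite structure the inductive definition of truth can be run as a program. The following Python sketch is an illustration under stated assumptions: formulas are nested tuples with tags chosen here, and instead of adding names to the language it carries an environment `env` that assigns elements to variables, which plays the same role as substituting names.

```python
def eval_term(term, funcs, env):
    """Evaluate a term; variables are str, F t1...tn is (F, t1, ..., tn)."""
    if isinstance(term, str):
        return env[term]
    symbol, *args = term
    return funcs[symbol](*(eval_term(a, funcs, env) for a in args))

def holds(formula, universe, funcs, rels, env=None):
    """Decide truth of a formula in a finite structure under env."""
    env = env or {}
    op = formula[0]
    recur = lambda f, e=env: holds(f, universe, funcs, rels, e)
    if op == "top":
        return True
    if op == "bot":
        return False
    if op == "=":
        return eval_term(formula[1], funcs, env) == eval_term(formula[2], funcs, env)
    if op == "rel":  # ("rel", R, t1, ..., tm)
        args = tuple(eval_term(t, funcs, env) for t in formula[2:])
        return args in rels[formula[1]]
    if op == "not":
        return not recur(formula[1])
    if op == "or":
        return recur(formula[1]) or recur(formula[2])
    if op == "and":
        return recur(formula[1]) and recur(formula[2])
    if op == "exists":  # some element of the universe works
        return any(recur(formula[2], {**env, formula[1]: a}) for a in universe)
    if op == "forall":  # every element of the universe works
        return all(recur(formula[2], {**env, formula[1]: a}) for a in universe)
    raise ValueError(f"unknown formula shape: {op!r}")

def cyclic(n):
    """The group Z/nZ as a structure (universe, functions, relations)."""
    return list(range(n)), {"0": lambda: 0, "+": lambda a, b: (a + b) % n}, {}

# The sentence: for all x there exists y with y + y = x.
halving = ("forall", "x", ("exists", "y", ("=", ("+", "y", "y"), "x")))
```

Each recursive call is on a formula with fewer logical symbols, mirroring the induction in the definition. The sentence `halving` holds in $\mathbb{Z}/5\mathbb{Z}$ but fails in $\mathbb{Z}/4\mathbb{Z}$.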


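It is easy to check that for an $L_A$-sentence $\sigma = \exists x_1 \dots x_n \varphi(x_1,\dots,x_n)$,

$$\mathcal{A} \models \sigma \iff \mathcal{A} \models \varphi(\underline{a}_1,\dots,\underline{a}_n) \text{ for some } (a_1,\dots,a_n) \in A^n,$$

and that for an $L_A$-sentence $\sigma = \forall x_1 \dots x_n \varphi(x_1,\dots,x_n)$,

$$\mathcal{A} \models \sigma \iff \mathcal{A} \models \varphi(\underline{a}_1,\dots,\underline{a}_n) \text{ for all } (a_1,\dots,a_n) \in A^n.$$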


Definition. Given an $L_A$-formula $\varphi(x_1,\dots,x_n)$ we let $\varphi^{\mathcal{A}}$ be the following
subset of $A^n$:

$$\varphi^{\mathcal{A}} = \{(a_1,\dots,a_n) : \mathcal{A} \models \varphi(\underline{a}_1,\dots,\underline{a}_n)\}.$$

The formula $\varphi(x_1,\dots,x_n)$ is said to define the set $\varphi^{\mathcal{A}}$ in $\mathcal{A}$. A set $S \subseteq A^n$
is said to be definable in $\mathcal{A}$ if $S = \varphi^{\mathcal{A}}$ for some $L_A$-formula $\varphi(x_1,\dots,x_n)$. If
moreover $\varphi$ can be chosen to be an L-formula, then $S$ is said to be 0-definable
in $\mathcal{A}$.

Examples.

(1) The set $\{r \in \mathbb{R} : r < \sqrt{2}\}$ is 0-definable in $(\mathbb{R}; <, 0, 1, +, -, \cdot)$: it is defined
by the formula $(x^2 < 1 + 1) \vee (x < 0)$. (Here $x^2$ abbreviates the term $x \cdot x$.)

(2) The set $\{r \in \mathbb{R} : r < \pi\}$ is definable in $(\mathbb{R}; <, 0, 1, +, -, \cdot)$: it is defined by
the formula $x < \underline{\pi}$.


To show that a set $X \subseteq A$ is not 0-definable in $\mathcal{A}$, one can sometimes use
automorphisms of $\mathcal{A}$; see the exercises below. We call a map $f : X \to A^n$ with
$X \subseteq A^m$ definable in $\mathcal{A}$ if its graph as a subset of $A^{m+n}$ is definable in $\mathcal{A}$; note
that then its domain $X$ is definable in $\mathcal{A}$.
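
In a finite structure a definable set can be listed by testing each element against the defining formula. As a small illustration not taken from the notes, the snippet below computes the subset of $\mathbb{Z}/7\mathbb{Z}$ defined in $(\mathbb{Z}/7\mathbb{Z}; \cdot)$ by the formula $\exists y\,(y \cdot y = x)$, with the quantifier rendered as `any` over the universe; the variable names are choices made here.

```python
# Universe of the structure (Z/7Z; *), with multiplication taken modulo 7.
MODULUS = 7
universe = range(MODULUS)

def mul(a, b):
    return (a * b) % MODULUS

# The set defined by  exists y (y * y = x):  all x having a square root.
squares = {x for x in universe if any(mul(y, y) == x for y in universe)}

# Its complement is defined by the negation of the same formula.
non_squares = set(universe) - squares
```

The existential quantifier corresponds to projecting the graph $\{(x, y) : y \cdot y = x\}$ onto its first coordinate, and the complement is again definable, by the negated formula.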


We now single out formulas by certain syntactical conditions. These conditions 
have semantic counterparts in terms of the behaviour of these formulas under 
various kinds of homomorphisms, as shown in some exercises below. (These 
exercises also show that isomorphic L-structures satisfy exactly the same L- 
sentences.) 

An L-formula is said to be quantifier-free if it has no occurrences of $\exists$ and
no occurrences of $\forall$. An L-formula is said to be existential if it has the form
$\exists x_1 \dots x_m \varphi$ with distinct $x_1,\dots,x_m$ and a quantifier-free L-formula $\varphi$. An
L-formula is said to be universal if it has the form $\forall x_1 \dots x_m \varphi$ with distinct
$x_1,\dots,x_m$ and a quantifier-free L-formula $\varphi$. An L-formula is said to be positive
if it has no occurrences of $\neg$ (but it can have occurrences of $\bot$).


Exercises. 
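(1) Let $\varphi$ and $\psi$ be L-formulas; put $\mathrm{sf}(\varphi) :=$ set of subformulas of $\varphi$.
(a) If $\varphi$ is atomic, then $\mathrm{sf}(\varphi) = \{\varphi\}$.
(b) $\mathrm{sf}(\neg\varphi) = \{\neg\varphi\} \cup \mathrm{sf}(\varphi)$.
(c) $\mathrm{sf}(\varphi \vee \psi) = \{\varphi \vee \psi\} \cup \mathrm{sf}(\varphi) \cup \mathrm{sf}(\psi)$, and $\mathrm{sf}(\varphi \wedge \psi) = \{\varphi \wedge \psi\} \cup \mathrm{sf}(\varphi) \cup \mathrm{sf}(\psi)$.
(d) $\mathrm{sf}(\exists x\varphi) = \{\exists x\varphi\} \cup \mathrm{sf}(\varphi)$, and $\mathrm{sf}(\forall x\varphi) = \{\forall x\varphi\} \cup \mathrm{sf}(\varphi)$.
(2) Let $\varphi$ and $\psi$ be L-formulas, $x, y$ variables, and $t$ an L-term.
(a) $(\neg\varphi)(t/x) = \neg(\varphi(t/x))$.
(b) $(\varphi \vee \psi)(t/x) = \varphi(t/x) \vee \psi(t/x)$, and $(\varphi \wedge \psi)(t/x) = \varphi(t/x) \wedge \psi(t/x)$.
(c) $(\exists y\varphi)(t/x) = \exists y(\varphi(t/x))$ if $x$ and $y$ are different, and $(\exists y\varphi)(t/x) = \exists y\varphi$ if
$x$ and $y$ are the same; likewise with $\forall y\varphi$.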


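(3) If $t(x_1,\dots,x_n)$ is an $L_A$-term and $a_1,\dots,a_n \in A$, then

$$t(\underline{a}_1,\dots,\underline{a}_n)^{\mathcal{A}} = t^{\mathcal{A}}(a_1,\dots,a_n).$$

(4) Suppose that $S_1 \subseteq A^n$ and $S_2 \subseteq A^n$ are defined in $\mathcal{A}$ by the $L_A$-formulas
$\varphi_1(x_1,\dots,x_n)$ and $\varphi_2(x_1,\dots,x_n)$ respectively. Then:
(a) $S_1 \cup S_2$ is defined in $\mathcal{A}$ by $(\varphi_1 \vee \varphi_2)(x_1,\dots,x_n)$.
(b) $S_1 \cap S_2$ is defined in $\mathcal{A}$ by $(\varphi_1 \wedge \varphi_2)(x_1,\dots,x_n)$.
(c) $A^n \setminus S_1$ is defined in $\mathcal{A}$ by $\neg\varphi_1(x_1,\dots,x_n)$.
(d) $S_1 \subseteq S_2 \iff \mathcal{A} \models \forall x_1 \dots x_n(\varphi_1 \to \varphi_2)$.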


(5) Let $\pi : A^{m+n} \to A^m$ be the projection map given by

$$\pi(a_1,\dots,a_{m+n}) = (a_1,\dots,a_m),$$

and for $S \subseteq A^{m+n}$ and $a \in A^m$, put

$$S(a) := \{b \in A^n : (a,b) \in S\} \quad \text{(a section of } S\text{)}.$$

Suppose that $S \subseteq A^{m+n}$ is defined in $\mathcal{A}$ by the $L_A$-formula $\varphi(x,y)$ where
$x = (x_1,\dots,x_m)$ and $y = (y_1,\dots,y_n)$. Then $\exists y_1 \dots y_n \varphi(x,y)$ defines in $\mathcal{A}$ the subset
$\pi(S)$ of $A^m$, and $\forall y_1 \dots y_n \varphi(x,y)$ defines in $\mathcal{A}$ the set

$$\{a \in A^m : S(a) = A^n\}.$$


(6) The following sets are 0-definable in the corresponding structures:

(a) The ordering relation $\{(m,n) \in \mathbb{N}^2 : m < n\}$ in $(\mathbb{N}; 0, +)$.

(b) The set $\{2,3,5,7,\dots\}$ of prime numbers in the semiring $\mathcal{N} = (\mathbb{N}; 0, 1, +, \cdot)$.

(c) The set $\{2^n : n \in \mathbb{N}\}$ in the semiring $\mathcal{N}$.

(d) The set $\{a \in \mathbb{R} : f \text{ is continuous at } a\}$ in $(\mathbb{R}; <, f)$ where $f : \mathbb{R} \to \mathbb{R}$ is any
function.


(7) Let the symbols of $L$ be a binary relation symbol $<$ and a unary relation symbol
$U$. Then there is an L-sentence $\sigma$ such that for all $X \subseteq \mathbb{R}$ we have

$$(\mathbb{R}; <, X) \models \sigma \iff X \text{ is finite}.$$


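(8) Let $\mathcal{A} \subseteq \mathcal{B}$. Then we consider $L_A$ to be a sublanguage of $L_B$ in such a way
that each $a \in A$ has the same name in $L_A$ as in $L_B$. This convention is in force
throughout these notes.

(a) For each variable-free $L_A$-term $t$ we have $t^{\mathcal{A}} = t^{\mathcal{B}}$.
(b) If the $L_A$-sentence $\sigma$ is quantifier-free, then $\mathcal{A} \models \sigma \iff \mathcal{B} \models \sigma$.
(c) If $\sigma$ is an existential $L_A$-sentence, then $\mathcal{A} \models \sigma \Rightarrow \mathcal{B} \models \sigma$.
(d) If $\sigma$ is a universal $L_A$-sentence, then $\mathcal{B} \models \sigma \Rightarrow \mathcal{A} \models \sigma$.


(9) Suppose $h : \mathcal{A} \to \mathcal{B}$ is a homomorphism of L-structures. For each $L_A$-term $t$,
let $t_h$ be the $L_B$-term obtained from $t$ by replacing each occurrence of a name
$\underline{a}$ of an element $a \in A$ by the name $\underline{ha}$ of the corresponding element $ha \in B$.
Similarly, for each $L_A$-formula $\varphi$, let $\varphi_h$ be the $L_B$-formula obtained from $\varphi$ by
replacing each occurrence of a name $\underline{a}$ of an element $a \in A$ by the name $\underline{ha}$ of
the corresponding element $ha \in B$. Note that if $\varphi$ is a sentence, so is $\varphi_h$. Then:
(a) if $t$ is a variable-free $L_A$-term, then $h(t^{\mathcal{A}}) = t_h^{\mathcal{B}}$;
(b) if $\sigma$ is a positive $L_A$-sentence without $\forall$-symbol, then $\mathcal{A} \models \sigma \Rightarrow \mathcal{B} \models \sigma_h$;
(c) if $\sigma$ is a positive $L_A$-sentence and $h$ is surjective, then $\mathcal{A} \models \sigma \Rightarrow \mathcal{B} \models \sigma_h$;
(d) if $\sigma$ is an $L_A$-sentence and $h$ is an isomorphism, then $\mathcal{A} \models \sigma \iff \mathcal{B} \models \sigma_h$.


(10) If $f$ is an automorphism of $\mathcal{A}$ and $X \subseteq A$ is 0-definable in $\mathcal{A}$, then $f(X) = X$.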


2.6 Models 


In the rest of this chapter $L$ is a language, $\mathcal{A}$ is an L-structure (with underlying
set $A$), and, unless indicated otherwise, $t$ is an L-term, $\varphi$, $\psi$, and $\theta$ are
L-formulas, $\sigma$ is an L-sentence, and $\Sigma$ is a set of L-sentences. We drop the prefix
L in “L-term” and “L-formula” and so on, unless this would cause confusion.

Definition. We say that $\mathcal{A}$ is a model of $\Sigma$ or $\Sigma$ holds in $\mathcal{A}$ (denoted $\mathcal{A} \models \Sigma$)
if $\mathcal{A} \models \sigma$ for each $\sigma \in \Sigma$.


To discuss examples it is convenient to introduce some notation. Suppose $L$
contains (at least) the constant symbol $0$ and the binary function symbol $+$.
Given any terms $t_1,\dots,t_n$ we define the term $t_1 + \dots + t_n$ inductively as follows:
it is the term $0$ if $n = 0$, the term $t_1$ if $n = 1$, and the term $(t_1 + \dots + t_{n-1}) + t_n$
for $n > 1$. We write $nt$ for the term $t + \dots + t$ with $n$ summands; in particular, $0t$
and $1t$ denote the terms $0$ and $t$ respectively. Suppose $L$ contains the constant
symbol $1$ and the binary function symbol $\cdot$ (the multiplication sign). Then we
have similar notational conventions for $t_1 \cdot \ldots \cdot t_n$ and $t^n$; in particular, for $n = 0$
both stand for the term $1$, and $t^1$ is just $t$.
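
The inductive convention for iterated sums can be spelled out as a short recursive construction. This Python sketch is illustrative only; terms are nested tuples as an encoding chosen here, and `sum_term` and `multiple` are hypothetical helper names.

```python
# A term is a variable name (str) or a tuple (F, t1, ..., tn);
# the constant symbol 0 is the one-element tuple ("0",).
def sum_term(terms):
    """Build t1 + ... + tn, bracketed to the left; the empty sum is 0."""
    if not terms:
        return ("0",)
    result = terms[0]
    for t in terms[1:]:
        # (t1 + ... + t_{k-1}) + t_k
        result = ("+", result, t)
    return result

def multiple(n, t):
    """Build the term nt, that is, t + ... + t with n summands."""
    return sum_term([t] * n)

three_x = multiple(3, "x")  # the term (x + x) + x
```

Note that `multiple(n, t)` is a term of the language for each fixed natural number `n`; the number `n` itself is not a symbol of the language.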


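Examples. Fix three distinct variables $x, y, z$.
(1) Totally ordered sets are the $L_{\mathrm{O}}$-structures that are models of

$$\{\forall x(x \not< x),\ \forall xyz((x < y \wedge y < z) \to x < z),\ \forall xy(x < y \vee x = y \vee y < x)\}.$$

(2) Groups are the $L_{\mathrm{Gr}}$-structures that are models of

$$\mathrm{Gr} := \{\forall x(x \cdot 1 = x \wedge 1 \cdot x = x),\ \forall x(x \cdot x^{-1} = 1 \wedge x^{-1} \cdot x = 1),\ \forall xyz((x \cdot y) \cdot z = x \cdot (y \cdot z))\}.$$

(3) Abelian groups are the $L_{\mathrm{Ab}}$-structures that are models of

$$\mathrm{Ab} := \{\forall x(x + 0 = x),\ \forall x(x + (-x) = 0),\ \forall xy(x + y = y + x),\ \forall xyz((x + y) + z = x + (y + z))\}.$$

(4) Torsion-free abelian groups are the $L_{\mathrm{Ab}}$-structures that are models of

$$\mathrm{Ab} \cup \{\forall x(nx = 0 \to x = 0) : n = 1, 2, 3, \dots\}.$$

(5) Rings are the $L_{\mathrm{Ring}}$-structures that are models of

$$\mathrm{Ring} := \mathrm{Ab} \cup \{\forall xyz((x \cdot y) \cdot z = x \cdot (y \cdot z)),\ \forall x(x \cdot 1 = x \wedge 1 \cdot x = x),\ \forall xyz(x \cdot (y + z) = x \cdot y + x \cdot z \wedge (x + y) \cdot z = x \cdot z + y \cdot z)\}.$$

(6) Fields are the $L_{\mathrm{Ring}}$-structures that are models of

$$\mathrm{Fl} := \mathrm{Ring} \cup \{\forall xy(x \cdot y = y \cdot x),\ 1 \neq 0,\ \forall x(x \neq 0 \to \exists y(x \cdot y = 1))\}.$$

(7) Fields of characteristic 0 are the $L_{\mathrm{Ring}}$-structures that are models of

$$\mathrm{Fl}(0) := \mathrm{Fl} \cup \{n1 \neq 0 : n = 2, 3, 5, 7, 11, \dots\}.$$

(8) Given a prime number $p$, fields of characteristic $p$ are the $L_{\mathrm{Ring}}$-structures
that are models of $\mathrm{Fl}(p) := \mathrm{Fl} \cup \{p1 = 0\}$.

(9) Algebraically closed fields are the $L_{\mathrm{Ring}}$-structures that are models of

$$\mathrm{ACF} := \mathrm{Fl} \cup \{\forall u_1 \dots u_n \exists x(x^n + u_1 x^{n-1} + \dots + u_n = 0) : n = 2, 3, 4, 5, \dots\}.$$

Here $u_1, u_2, u_3, \dots$ is some fixed infinite sequence of distinct variables, distinct
also from $x$, and $u_i x^{n-i}$ abbreviates $u_i \cdot x^{n-i}$, for $i = 1,\dots,n$.

(10) Algebraically closed fields of characteristic 0 are the $L_{\mathrm{Ring}}$-structures that
are models of $\mathrm{ACF}(0) := \mathrm{ACF} \cup \{n1 \neq 0 : n = 2, 3, 5, 7, 11, \dots\}$.

(11) Given a prime number $p$, algebraically closed fields of characteristic $p$ are
the $L_{\mathrm{Ring}}$-structures that are models of $\mathrm{ACF}(p) := \mathrm{ACF} \cup \{p1 = 0\}$.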


In Example (1) our use of the symbol $<$ rather than $\le$ indicates that we take the
strict version of a total order, as the sentences mentioned in (1) specify. This is
a minor difference with how we defined totally ordered sets in Section 2.3, using
the nonstrict version of an ordering, with $\le$ as the primitive notion. Another
minor difference is that in Section 2.3 we allowed the underlying set of a poset
to be empty, but in (1) the underlying set of a totally ordered set is nonempty,
since that is a general requirement for the structures considered in these notes.
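
For a finite structure, being a model of a finite set of universal sentences such as Gr can be verified by exhausting all tuples. The following is a brute-force Python sketch for illustration; the function `is_group` and the example structures are assumptions made here, not part of the notes.

```python
from itertools import product

def is_group(universe, mul, inv, one):
    """Check the three sentences of Gr in a finite structure whose binary
    operation, unary operation and constant are mul, inv and one."""
    elems = list(universe)
    # For all x: x * 1 = x and 1 * x = x.
    identity = all(mul(x, one) == x and mul(one, x) == x for x in elems)
    # For all x: x * inv(x) = 1 and inv(x) * x = 1.
    inverses = all(mul(x, inv(x)) == one and mul(inv(x), x) == one
                   for x in elems)
    # For all x, y, z: (x * y) * z = x * (y * z).
    assoc = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                for x, y, z in product(elems, repeat=3))
    return identity and inverses and assoc

# Z/6Z under addition, written with the symbols of the language of groups.
additive = is_group(range(6), lambda a, b: (a + b) % 6, lambda a: -a % 6, 0)

# Z/6Z under multiplication, with the identity map as a candidate inverse.
multiplicative = is_group(range(6), lambda a, b: (a * b) % 6, lambda a: a, 1)
```

The first structure is a model of Gr; the second is not, since for instance $0 \cdot 0^{-1} = 0 \neq 1$ whatever the unary operation is.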


Definition. We say that $\sigma$ is a logical consequence of $\Sigma$ (written $\Sigma \models \sigma$) if $\sigma$
is true in every model of $\Sigma$.


Example. It is well-known that in any ring $R$ we have $x \cdot 0 = 0$ for all $x \in R$.
This can now be expressed as $\mathrm{Ring} \models \forall x(x \cdot 0 = 0)$.


We defined what it means for a sentence o to hold in a given structure A. We 
now extend this to arbitrary formulas. 

First define an $\mathcal{A}$-instance of a formula $\varphi = \varphi(x_1,\dots,x_m)$ to be an
$L_A$-sentence of the form $\varphi(\underline{a}_1,\dots,\underline{a}_m)$ with $a_1,\dots,a_m \in A$. Of course $\varphi$ can also
be written as $\varphi(y_1,\dots,y_n)$ for another sequence of variables $y_1,\dots,y_n$; for
example, $y_1,\dots,y_n$ could be obtained by permuting $x_1,\dots,x_m$, or it could be
$x_1,\dots,x_m,x_{m+1}$, obtained by adding a variable $x_{m+1}$. Thus for the above to
count as a definition of “$\mathcal{A}$-instance,” the reader should check that these different
ways of specifying variables (including at least the variables occurring free
in $\varphi$) give the same $\mathcal{A}$-instances.


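Definition. A formula $\varphi$ is said to be valid in $\mathcal{A}$ (notation: $\mathcal{A} \models \varphi$) if all its
$\mathcal{A}$-instances are true in $\mathcal{A}$.


The reader should check that if $\varphi = \varphi(x_1,\dots,x_m)$, then

$$\mathcal{A} \models \varphi \iff \mathcal{A} \models \forall x_1 \dots \forall x_m \varphi.$$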


We also extend the notion of “logical consequence of $\Sigma$” to formulas (but $\Sigma$
continues to be a set of sentences).


Definition. We say that $\varphi$ is a logical consequence of $\Sigma$ (notation: $\Sigma \models \varphi$) if
$\mathcal{A} \models \varphi$ for all models $\mathcal{A}$ of $\Sigma$.

One should not confuse the notion of “logical consequence of $\Sigma$” with that of
“provable from $\Sigma$.” We shall give a definition of provable from $\Sigma$ in the next
section. The two notions will turn out to be equivalent, but that is hardly
obvious from their definitions: we shall need much of the next chapter to prove
this equivalence, which is called the Completeness Theorem for Predicate Logic.
We finish this section with two basic facts:


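Lemma 2.6.1. Let $\alpha(x_1,\dots,x_m)$ be an $L_A$-term, and recall that $\alpha$ defines a
map $\alpha^{\mathcal{A}} : A^m \to A$. Let $t_1,\dots,t_m$ be variable-free $L_A$-terms, with $t_i^{\mathcal{A}} = a_i \in A$
for $i = 1,\dots,m$. Then $\alpha(t_1,\dots,t_m)$ is a variable-free $L_A$-term, and

$$\alpha(t_1,\dots,t_m)^{\mathcal{A}} = \alpha(\underline{a}_1,\dots,\underline{a}_m)^{\mathcal{A}} = \alpha^{\mathcal{A}}(t_1^{\mathcal{A}},\dots,t_m^{\mathcal{A}}).$$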


This follows by a straightforward induction on a. 


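Lemma 2.6.2. Let $t_1,\dots,t_m$ be variable-free $L_A$-terms with $t_i^{\mathcal{A}} = a_i \in A$
for $i = 1,\dots,m$. Let $\varphi(x_1,\dots,x_m)$ be an $L_A$-formula. Then the $L_A$-formula
$\varphi(t_1,\dots,t_m)$ is a sentence and

$$\mathcal{A} \models \varphi(t_1,\dots,t_m) \iff \mathcal{A} \models \varphi(\underline{a}_1,\dots,\underline{a}_m).$$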


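Proof. To keep notations simple we give the proof only for $m = 1$ with $t = t_1$
and $x = x_1$. We proceed by induction on the number of logical symbols in $\varphi(x)$.

Suppose that $\varphi$ is atomic. The case where $\varphi$ is $\top$ or $\bot$ is obvious. Assume $\varphi$
is $R\alpha_1 \dots \alpha_m$ where $R \in L^{\mathrm{r}}$ is $m$-ary and $\alpha_1(x),\dots,\alpha_m(x)$ are $L_A$-terms. Then
$\varphi(t) = R\alpha_1(t) \dots \alpha_m(t)$ and $\varphi(\underline{a}) = R\alpha_1(\underline{a}) \dots \alpha_m(\underline{a})$. We have $\mathcal{A} \models \varphi(t)$ iff
$(\alpha_1(t)^{\mathcal{A}},\dots,\alpha_m(t)^{\mathcal{A}}) \in R^{\mathcal{A}}$ and also $\mathcal{A} \models \varphi(\underline{a})$ iff $(\alpha_1(\underline{a})^{\mathcal{A}},\dots,\alpha_m(\underline{a})^{\mathcal{A}}) \in R^{\mathcal{A}}$.
As $\alpha_i(t)^{\mathcal{A}} = \alpha_i(\underline{a})^{\mathcal{A}}$ for all $i$ by the previous lemma, we have $\mathcal{A} \models \varphi(t)$ iff
$\mathcal{A} \models \varphi(\underline{a})$. The case that $\varphi(x)$ is $\alpha(x) = \beta(x)$ is handled the same way.

It is also clear that the desired property is inherited by disjunctions,
conjunctions and negations of formulas $\varphi(x)$ that have the property. Suppose now
that $\varphi(x) = \exists y\,\psi$.

Case $y \neq x$: Then $\psi = \psi(x,y)$, $\varphi(t) = \exists y\psi(t,y)$ and $\varphi(\underline{a}) = \exists y\psi(\underline{a},y)$. As
$\varphi(t) = \exists y\psi(t,y)$, we have $\mathcal{A} \models \varphi(t)$ iff $\mathcal{A} \models \psi(t,\underline{b})$ for some $b \in A$. By the
inductive hypothesis the latter is equivalent to $\mathcal{A} \models \psi(\underline{a},\underline{b})$ for some $b \in A$,
hence equivalent to $\mathcal{A} \models \exists y\psi(\underline{a},y)$. As $\varphi(\underline{a}) = \exists y\psi(\underline{a},y)$, we conclude that
$\mathcal{A} \models \varphi(t)$ iff $\mathcal{A} \models \varphi(\underline{a})$.

Case $y = x$: Then $x$ does not occur free in $\varphi(x) = \exists x\psi$. So $\varphi(t) = \varphi(\underline{a}) = \varphi$
is an $L_A$-sentence, and $\mathcal{A} \models \varphi(t) \iff \mathcal{A} \models \varphi(\underline{a})$ is obvious.

When $\varphi(x) = \forall y\psi$ then one can proceed exactly as above by distinguishing
two cases.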


2.7 Logical Axioms and Rules; Formal Proofs 


In this section we introduce a proof system for predicate logic and state its
completeness. We then derive as a consequence the compactness theorem and
some of its corollaries. The completeness is proved in the next chapter. We
remind the reader of the notational conventions at the beginning of Section 2.6.


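A propositional axiom of $L$ is by definition a formula that for some $\varphi, \psi, \theta$ occurs
in the list below:

1. $\top$
2. $\varphi \to (\varphi \vee \psi)$; $\varphi \to (\psi \vee \varphi)$
3. $\neg\varphi \to (\neg\psi \to \neg(\varphi \vee \psi))$
4. $(\varphi \wedge \psi) \to \varphi$; $(\varphi \wedge \psi) \to \psi$
5. $\varphi \to (\psi \to (\varphi \wedge \psi))$
6. $(\varphi \to (\psi \to \theta)) \to ((\varphi \to \psi) \to (\varphi \to \theta))$
7. $\varphi \to (\neg\varphi \to \bot)$
8. $(\neg\varphi \to \bot) \to \varphi$

Each of items 2-8 is a scheme describing infinitely many axioms. Note that
this list is the same as the list in Section 2.2 except that instead of propositions
$p, q, r$ we have formulas $\varphi, \psi, \theta$.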

The logical axioms of L are the propositional axioms of L and the equality 
and quantifier axioms of L as defined below. 


Definition. The equality axioms of $L$ are the following formulas:

(i) $x = x$,

(ii) $x = y \to y = x$,

(iii) $(x = y \wedge y = z) \to x = z$,

(iv) $(x_1 = y_1 \wedge \dots \wedge x_m = y_m \wedge Rx_1 \dots x_m) \to Ry_1 \dots y_m$,

(v) $(x_1 = y_1 \wedge \dots \wedge x_n = y_n) \to Fx_1 \dots x_n = Fy_1 \dots y_n$,

with the following restrictions on the variables and symbols of $L$: $x, y, z$ are
distinct in (ii) and (iii); in (iv), $x_1,\dots,x_m, y_1,\dots,y_m$ are distinct and $R \in L^{\mathrm{r}}$
is $m$-ary; in (v), $x_1,\dots,x_n, y_1,\dots,y_n$ are distinct, and $F \in L^{\mathrm{f}}$ is $n$-ary. Note
that (i) represents an axiom scheme rather than a single axiom, since different
variables $x$ give different formulas $x = x$. Likewise with (ii)-(v).


Let $x$ and $y$ be distinct variables, and let $\varphi(y)$ be the formula $\exists x(x \neq y)$.
Then $\varphi(y)$ is valid in all $\mathcal{A}$ with $|A| > 1$, but $\varphi(x/y)$ is invalid in all $\mathcal{A}$. Thus
substituting $x$ for the free occurrences of $y$ does not always preserve validity. To
get rid of this anomaly, we introduce the following restriction on substitutions
of a term $t$ for free occurrences of $y$.


Definition. We say that $t$ is free for $y$ in $\varphi$, if no variable in $t$ can become bound
upon replacing the free occurrences of $y$ in $\varphi$ by $t$, more precisely: whenever $x$
is a variable in $t$, then there are no occurrences of subformulas in $\varphi$ of the form
$\exists x\psi$ or $\forall x\psi$ that contain an occurrence of $y$ that is free in $\varphi$.

Note that if $t$ is variable-free, then $t$ is free for $y$ in $\varphi$. We remark that “free
for” abbreviates “free to be substituted for.” In exercise 3 the reader is asked to
show that, with this restriction, substitution of a term for the free occurrences
of a variable does preserve validity.
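
The condition “$t$ is free for $y$ in $\varphi$” is purely syntactic and can be checked by one pass over the formula, tracking which variables are quantified above the current position. The Python sketch below is an illustration under stated assumptions: the tuple encoding of formulas and the names `term_vars` and `free_for` are choices made here.

```python
def term_vars(term):
    """Variables of a term: a variable is a str, F t1...tn is (F, t1, ..., tn)."""
    if isinstance(term, str):
        return {term}
    return set().union(*(term_vars(arg) for arg in term[1:]))

def free_for(t, y, formula, bound=frozenset()):
    """Return True if the term t is free for the variable y in formula.
    bound collects the variables quantified above the current position."""
    op = formula[0]
    if op in ("top", "bot"):
        return True
    if op in ("=", "rel"):  # ("=", t1, t2) or ("rel", R, t1, ..., tm)
        terms = formula[1:] if op == "=" else formula[2:]
        # y occurs here, and this occurrence is free in the whole formula.
        y_free_here = y not in bound and any(y in term_vars(s) for s in terms)
        # Some variable of t would be captured by a quantifier above.
        captured = bool(term_vars(t) & bound)
        return not (y_free_here and captured)
    if op == "not":
        return free_for(t, y, formula[1], bound)
    if op in ("or", "and"):
        return (free_for(t, y, formula[1], bound)
                and free_for(t, y, formula[2], bound))
    if op in ("exists", "forall"):  # (quantifier, x, body)
        return free_for(t, y, formula[2], bound | {formula[1]})
    raise ValueError(f"unknown formula shape: {op!r}")

# The formula  exists x (x != y)  discussed in the text.
phi = ("exists", "x", ("not", ("=", "x", "y")))
```

For this `phi`, the variable `x` is not free for `y`, while a fresh variable `z` or any variable-free term is.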


Definition. The quantifier axioms of $L$ are the formulas $\varphi(t/y) \to \exists y\varphi$ and
$\forall y\varphi \to \varphi(t/y)$ where $t$ is free for $y$ in $\varphi$.


These axioms have been chosen to have the following property. 
Proposition 2.7.1. The logical axioms of L are valid in every L-structure. 


We first prove this for the propositional axioms of $L$. Let $\alpha_1,\dots,\alpha_n$ be distinct
propositional atoms not in $L$. Let $p = p(\alpha_1,\dots,\alpha_n) \in \mathrm{Prop}\{\alpha_1,\dots,\alpha_n\}$. Let
$\varphi_1,\dots,\varphi_n$ be formulas and let $p(\varphi_1,\dots,\varphi_n)$ be the word obtained by replacing
each occurrence of $\alpha_i$ in $p$ by $\varphi_i$ for $i = 1,\dots,n$. One checks easily that
$p(\varphi_1,\dots,\varphi_n)$ is a formula.


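Lemma 2.7.2. Suppose $\varphi_i = \varphi_i(x_1,\dots,x_m)$ for $1 \le i \le n$ and let $a_1,\dots,a_m \in A$.
Define a truth assignment $t : \{\alpha_1,\dots,\alpha_n\} \to \{0,1\}$ by $t(\alpha_i) = 1$ iff
$\mathcal{A} \models \varphi_i(\underline{a}_1,\dots,\underline{a}_m)$. Then $p(\varphi_1,\dots,\varphi_n)$ is an L-formula and

$$p(\varphi_1,\dots,\varphi_n)(\underline{a}_1/x_1,\dots,\underline{a}_m/x_m) = p(\varphi_1(\underline{a}_1,\dots,\underline{a}_m),\dots,\varphi_n(\underline{a}_1,\dots,\underline{a}_m)),$$

$$t(p(\alpha_1,\dots,\alpha_n)) = 1 \iff \mathcal{A} \models p(\varphi_1(\underline{a}_1,\dots,\underline{a}_m),\dots,\varphi_n(\underline{a}_1,\dots,\underline{a}_m)).$$

In particular, if $p$ is a tautology, then $\mathcal{A} \models p(\varphi_1,\dots,\varphi_n)$.


Proof. Easy induction on $p$. We leave the details to the reader.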


Definition. An L-tautology is a formula of the form $p(\varphi_1,\dots,\varphi_n)$ for some
tautology $p(\alpha_1,\dots,\alpha_n) \in \mathrm{Prop}\{\alpha_1,\dots,\alpha_n\}$ and some formulas $\varphi_1,\dots,\varphi_n$.


By Lemma 2.7.2 all L-tautologies are valid in all L-structures. The propositional
axioms of $L$ are L-tautologies, so all propositional axioms of $L$ are valid in all
L-structures. It is easy to check that all equality axioms of $L$ are valid in all
L-structures. In exercise 4 below the reader is asked to show that all quantifier
axioms of $L$ are valid in all L-structures. This finishes the proof of Proposition
2.7.1.
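
That the propositional axiom schemes are instances of tautologies can be confirmed by a truth table. The sketch below is an illustrative Python check, with propositions as nested tuples and implication unfolded as in the text; the names `implies`, `truth_value` and `is_tautology` are assumptions made here.

```python
from itertools import product

def implies(p, q):
    """p -> q abbreviates (not p) or q, as in the text."""
    return ("or", ("not", p), q)

def truth_value(p, assignment):
    """Extend a truth assignment on atoms to the proposition p."""
    op = p[0]
    if op == "atom":
        return assignment[p[1]]
    if op == "top":
        return True
    if op == "bot":
        return False
    if op == "not":
        return not truth_value(p[1], assignment)
    if op == "or":
        return truth_value(p[1], assignment) or truth_value(p[2], assignment)
    if op == "and":
        return truth_value(p[1], assignment) and truth_value(p[2], assignment)
    raise ValueError(f"unknown proposition shape: {op!r}")

def is_tautology(p, atoms):
    """Check p under every truth assignment to the listed atoms."""
    return all(truth_value(p, dict(zip(atoms, values)))
               for values in product([False, True], repeat=len(atoms)))

a, b, c = ("atom", "a"), ("atom", "b"), ("atom", "c")

# Scheme 6 and scheme 8 of the list of propositional axioms.
axiom6 = implies(implies(a, implies(b, c)),
                 implies(implies(a, b), implies(a, c)))
axiom8 = implies(implies(("not", a), ("bot",)), a)
```

Replacing the atoms by formulas then yields L-tautologies, which by Lemma 2.7.2 are valid in every L-structure; a proposition such as `("or", a, b)` fails the check and gives no such guarantee.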


Next we introduce rules for deriving new formulas from given formulas. 


Definition. The logical rules of $L$ are the following:

(i) Modus Ponens (MP): From $\varphi$ and $\varphi \to \psi$, infer $\psi$.

(ii) Generalization Rule (G): If the variable $x$ does not occur free in $\varphi$, then
(a) from $\varphi \to \psi$, infer $\varphi \to \forall x\psi$;
(b) from $\psi \to \varphi$, infer $\exists x\psi \to \varphi$.


A key property of the logical rules is that their application preserves validity.
Here is a more precise statement of this fact, to be verified by the reader.
(i) If $\mathcal{A} \models \varphi$ and $\mathcal{A} \models \varphi \to \psi$, then $\mathcal{A} \models \psi$.
(ii) Suppose $x$ does not occur free in $\varphi$. Then
(a) if $\mathcal{A} \models \varphi \to \psi$, then $\mathcal{A} \models \varphi \to \forall x\psi$;
(b) if $\mathcal{A} \models \psi \to \varphi$, then $\mathcal{A} \models \exists x\psi \to \varphi$.


Definition. A formal proof, or just proof, of $\varphi$ from $\Sigma$ is a sequence $\varphi_1,\dots,\varphi_n$
of formulas with $n \ge 1$ and $\varphi_n = \varphi$, such that for $k = 1,\dots,n$:

(i) either $\varphi_k \in \Sigma$,

(ii) or $\varphi_k$ is a logical axiom,

(iii) or there are $i, j \in \{1,\dots,k-1\}$ such that $\varphi_k$ can be inferred from $\varphi_i$ and
$\varphi_j$ by MP, or from $\varphi_i$ by G.

We say that $\Sigma$ proves $\varphi$ (notation: $\Sigma \vdash \varphi$) if there exists a proof of $\varphi$ from $\Sigma$.


Proposition 2.7.3. If $\Sigma \vdash \varphi$, then $\Sigma \models \varphi$.


This follows easily from earlier facts that we stated and which the reader was
asked to verify. The converse is more interesting, and due to Gödel (1930):


Theorem 2.7.4 (Completeness Theorem of Predicate Logic).

$$\Sigma \vdash \varphi \iff \Sigma \models \varphi.$$


Remark. Our choice of proof system, and thus our notion of formal proof,
is somewhat arbitrary. However, the equivalence of ⊢ and ⊨ (Completeness
Theorem) justifies our choice of logical axioms and rules and shows in particular
that no further logical axioms and rules are needed. Moreover, this equivalence
has consequences that can be stated in terms of ⊨ alone. An example is the
important Compactness Theorem. 


Theorem 2.7.5 (Compactness Theorem). If Σ ⊨ σ, then there is a finite subset
Σ0 of Σ such that Σ0 ⊨ σ.


The Compactness Theorem has many consequences. Here is one. 


Corollary 2.7.6. Suppose σ is an L_Ring-sentence that holds in all fields of
characteristic 0. Then there exists a natural number N such that σ is true in
all fields of characteristic p > N.


Proof. By assumption,


Fl(0) = Fl ∪ {n1 ≠ 0 : n = 2,3,5,...} ⊨ σ.


Then by Compactness, there is N ∈ N such that


Fl ∪ {n1 ≠ 0 : n = 2,3,5,..., n ≤ N} ⊨ σ.


It follows that σ is true in all fields of characteristic p > N.


The converse of this corollary fails, see exercise 9 below. Note that Fl(0) is
infinite. Could there be an alternative finite set of axioms whose models are
exactly the fields of characteristic 0?


Corollary 2.7.7. There is no finite set of L_Ring-sentences whose models are
exactly the fields of characteristic 0.


Proof. Suppose there is such a finite set of sentences {σ1,...,σN}. Let σ :=
σ1 ∧ ··· ∧ σN. Then the models of σ are just the fields of characteristic 0. By the
previous result σ holds in some field of characteristic p > 0. Contradiction!


Exercises. The conventions made at the beginning of Section 2.6 about L, 𝒜, t,
φ, ψ, θ, σ, Σ remain in force! All but the last two exercises are to be done without
using Theorem 2.7.4 or 2.7.5. Actually, (4) and (7) will be used in proving Theorem 2.7.4.


(1) Let L = {R} where R is a binary relation symbol, and let 𝒜 = (A; R) be a finite
L-structure (i.e. the set A is finite). Then there exists an L-sentence σ such that
the models of σ are exactly the L-structures isomorphic to 𝒜. (In fact, for an
arbitrary language L, two finite L-structures are isomorphic iff they satisfy the
same L-sentences.)


(2) Let φ and ψ be L-formulas, x, y variables, and t an L-term.
(a) If φ is atomic, then t is free for x in φ.
(b) t is free for x in ¬φ iff t is free for x in φ.
(c) t is free for x in φ ∨ ψ iff t is free for x in φ and in ψ; and t is free for x in
φ ∧ ψ iff t is free for x in φ and in ψ.
(d) t is free for x in ∃yφ iff either x and y are different and t is free for x in φ,
or x and y are the same; likewise with ∀yφ.


(3) If t is free for y in φ and φ is valid in 𝒜, then φ(t/y) is valid in 𝒜.

(4) Suppose t is free for y in φ = φ(x1,...,xn,y). Then:

(i) Each 𝒜-instance of the quantifier axiom φ(t/y) → ∃yφ has the form


φ(a1,...,an,τ) → ∃yφ(a1,...,an,y)


with a1,...,an ∈ A and τ a variable-free L_A-term.

(ii) The quantifier axiom φ(t/y) → ∃yφ is valid in 𝒜. (Hint: use Lemma 2.6.2.)
(iii) The quantifier axiom ∀yφ → φ(t/y) is valid in 𝒜.


(5) If φ is an L-tautology, then ⊢ φ.

(6) Σ ⊢ φi for i = 1,...,n ⟺ Σ ⊢ φ1 ∧ ··· ∧ φn.

(7) If Σ ⊢ φ → ψ and Σ ⊢ ψ → φ, then Σ ⊢ φ ↔ ψ.

(8) ⊢ ¬∃xφ ↔ ∀x¬φ and ⊢ ¬∀xφ ↔ ∃x¬φ.

(9) Indicate an L_Ring-sentence that is true in the field of real numbers, but false in
all fields of positive characteristic.


(10) Let σ be an L_Ab-sentence which holds in all non-trivial torsion-free abelian groups.
Then there exists N ∈ N such that σ is true in all groups Z/pZ where p is a
prime number and p > N.


(11) Suppose Σ has arbitrarily large finite models. Then Σ has an infinite model.
(Here “finite” and “infinite” refer to the underlying set of the model.) 


Chapter 3 


The Completeness Theorem 


In this chapter we prove the Completeness Theorem. As a byproduct we also 
derive some more elementary facts about predicate logic. The last section con- 
tains some of the basics of universal algebra, which we can treat here rather 
efficiently using our construction of a so-called term-model in the proof of the 
Completeness Theorem. 

Conventions on the use of L, 𝒜, t, φ, ψ, θ, σ and Σ are as in the beginning
of Section 2.6.


3.1 Another Form of Completeness 


It is convenient to prove first a variant of the Completeness Theorem. 


Definition. We say that Σ is consistent if Σ ⊬ ⊥, and otherwise (that is, if
Σ ⊢ ⊥), we call Σ inconsistent.


Theorem 3.1.1 (Completeness Theorem - second form).
Σ is consistent if and only if Σ has a model.


We first show that this second form of the Completeness Theorem implies the 
first form. This will be done through a series of technical lemmas, which are 
also useful later in this Chapter. 


Lemma 3.1.2. Suppose Σ ⊢ φ. Then Σ ⊢ ∀xφ.


Proof. From Σ ⊢ φ and the L-tautology φ → (¬∀xφ → φ) we obtain Σ ⊢
¬∀xφ → φ by MP. Then by G we have Σ ⊢ ¬∀xφ → ∀xφ. Using the L-
tautology (¬∀xφ → ∀xφ) → ∀xφ and MP we get Σ ⊢ ∀xφ.
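
The two L-tautologies used in this proof arise from propositional tautologies by substituting formulas for atoms, so they can be checked by a truth table. The Python sketch below is illustrative only: the tuple encoding of propositional formulas is an assumption made here, the atom "a" stands for φ and the atom "b" for the formula ∀xφ.

```python
from itertools import product

def atoms(p):
    """Set of propositional atoms (strings) occurring in p."""
    if isinstance(p, str):
        return {p}
    return set().union(*(atoms(q) for q in p[1:]))

def value(p, v):
    """Truth value of p under the assignment v (dict: atom -> bool)."""
    if isinstance(p, str):
        return v[p]
    if p[0] == "not":
        return not value(p[1], v)
    if p[0] == "imp":
        return (not value(p[1], v)) or value(p[2], v)
    raise ValueError(f"unknown connective {p[0]!r}")

def is_tautology(p):
    """Check p under every assignment of truth values to its atoms."""
    names = sorted(atoms(p))
    return all(value(p, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))

# "a" stands for phi and "b" for the formula (forall x phi).
first = ("imp", "a", ("imp", ("not", "b"), "a"))    # a -> (not b -> a)
second = ("imp", ("imp", ("not", "b"), "b"), "b")   # (not b -> b) -> b
print(is_tautology(first), is_tautology(second))
```

Both checks succeed, whereas a formula such as a → b is rejected by the same test.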


Lemma 3.1.3 (Deduction Lemma). Suppose Σ ∪ {σ} ⊢ φ. Then Σ ⊢ σ → φ.


Proof. By induction on the length of a proof of φ from Σ ∪ {σ}.

The cases where φ is a logical axiom, or φ ∈ Σ ∪ {σ}, or φ is obtained by
MP are treated just as in the proof of the Deduction Lemma of Propositional
Logic.

Suppose that φ is obtained by part (a) of G, so φ is φ1 → ∀xψ where x does
not occur free in φ1 and Σ ∪ {σ} ⊢ φ1 → ψ, and where we assume inductively
that Σ ⊢ σ → (φ1 → ψ). We have to argue that then Σ ⊢ σ → (φ1 → ∀xψ).
From the L-tautology (σ → (φ1 → ψ)) → ((σ ∧ φ1) → ψ) and MP we get Σ ⊢
(σ ∧ φ1) → ψ. Since x does not occur free in σ ∧ φ1 this gives Σ ⊢ (σ ∧ φ1) → ∀xψ,
by G. Using the L-tautology


((σ ∧ φ1) → ∀xψ) → (σ → (φ1 → ∀xψ))


and MP this gives Σ ⊢ σ → (φ1 → ∀xψ).
The case that φ is obtained by part (b) of G is left to the reader.


Corollary 3.1.4. Suppose Σ ∪ {σ1,...,σn} ⊢ φ. Then Σ ⊢ σ1 ∧ ... ∧ σn → φ.

We leave the proof as an exercise.

Corollary 3.1.5. Σ ⊢ σ if and only if Σ ∪ {¬σ} is inconsistent.

The proof is just like that of the corresponding fact of Propositional Logic.

Lemma 3.1.6. Σ ⊢ ∀yφ if and only if Σ ⊢ φ.


Proof. (⇐) This is Lemma 3.1.2. For (⇒), assume Σ ⊢ ∀yφ. We have the
quantifier axiom ∀yφ → φ, so by MP we get Σ ⊢ φ.


Corollary 3.1.7. Σ ⊢ ∀y1...∀ynφ if and only if Σ ⊢ φ.


Corollary 3.1.8. The second form of the Completeness Theorem implies the 
first form, Theorem 2.7.4. 


Proof. Assume the second form of the Completeness Theorem holds, and that
Σ ⊨ φ. It suffices to show that then Σ ⊢ φ. From Σ ⊨ φ we obtain Σ ⊨
∀y1...∀ynφ where φ = φ(y1,...,yn), and so Σ ∪ {¬σ} has no model where
σ is the sentence ∀y1...∀ynφ. But then by the second form of the Completeness
Theorem Σ ∪ {¬σ} is inconsistent. Then by Corollary 3.1.5 we have Σ ⊢ σ and
thus by Corollary 3.1.7 we get Σ ⊢ φ.


We finish this section with another form of the Compactness Theorem: 


Theorem 3.1.9 (Compactness Theorem - second form). 
If each finite subset of Σ has a model, then Σ has a model.


This follows from the second form of the Completeness Theorem. 


3.2 Proof of the Completeness Theorem 


We are now going to prove Theorem 3.1.1. Since (⇐) is clear, we focus our
attention on (⇒), that is, given a consistent set of sentences Σ we must show
that Σ has a model. This job will be done in a series of lemmas. Unless we say
so, we do not assume in those lemmas that Σ is consistent.


Lemma 3.2.1. Suppose Σ ⊢ φ and t is free for x in φ. Then Σ ⊢ φ(t/x).


Proof. From Σ ⊢ φ we get Σ ⊢ ∀xφ by Lemma 3.1.2. Then MP together with
the quantifier axiom ∀xφ → φ(t/x) gives Σ ⊢ φ(t/x) as required.


Lemma 3.2.2. Suppose Σ ⊢ φ, let x1,...,xn be distinct variables, and let
t1,...,tn be terms whose variables do not occur bound in φ. Then


Σ ⊢ φ(t1/x1,...,tn/xn).


Proof. Take distinct variables y1,...,yn that do not occur in φ or t1,...,tn
and that are distinct from x1,...,xn. Use Lemma 3.2.1 n times in succession to
obtain Σ ⊢ ψ where ψ = φ(y1/x1,...,yn/xn). Apply Lemma 3.2.1 again n times
to get Σ ⊢ ψ(t1/y1,...,tn/yn). To finish, observe that ψ(t1/y1,...,tn/yn) =
φ(t1/x1,...,tn/xn).


Lemma 3.2.3. Let t, t′, t1, t1′, ... be L-terms.
(1) Σ ⊢ t = t.
(2) If Σ ⊢ t = t′, then Σ ⊢ t′ = t.
(3) If Σ ⊢ t1 = t2 and Σ ⊢ t2 = t3, then Σ ⊢ t1 = t3.
(4) Let R ∈ L^r be m-ary and suppose Σ ⊢ ti = ti′ for i = 1,...,m and
Σ ⊢ Rt1...tm. Then Σ ⊢ Rt1′...tm′.
(5) Let F ∈ L^f be n-ary, and suppose Σ ⊢ ti = ti′ for i = 1,...,n. Then
Σ ⊢ Ft1...tn = Ft1′...tn′.


Proof. For (1), take an equality axiom x = x and apply Lemma 3.2.1. For
(2), we take an equality axiom x = y → y = x, apply Lemma 3.2.2 to obtain
⊢ t = t′ → t′ = t, and use MP. For (3), take an equality axiom


(x = y ∧ y = z) → (x = z),


apply Lemma 3.2.2 to get ⊢ (t1 = t2 ∧ t2 = t3) → t1 = t3, use Exercise 6 in
Section 2.7 and MP. To prove (4), take an equality axiom


x1 = y1 ∧ ... ∧ xm = ym ∧ Rx1...xm → Ry1...ym,


apply Lemma 3.2.2 to obtain


⊢ t1 = t1′ ∧ ... ∧ tm = tm′ ∧ Rt1...tm → Rt1′...tm′,


and use Exercise 6 as before, and MP. Part (5) is obtained similarly by taking
an equality axiom x1 = y1 ∧ ... ∧ xn = yn → Fx1...xn = Fy1...yn.


Definition. Let Term_L be the set of variable-free L-terms. We define a binary
relation ∼Σ on Term_L by


t1 ∼Σ t2 ⟺ Σ ⊢ t1 = t2.


Parts (1), (2) and (3) of the last lemma yield the following.


Lemma 3.2.4. The relation ∼Σ is an equivalence relation on Term_L.


Definition. Suppose L has at least one constant symbol. Then Term_L is non-
empty. We define the L-structure 𝒜_Σ as follows:

(i) Its underlying set is A_Σ := Term_L/∼Σ. Let [t] denote the equivalence class
of t ∈ Term_L with respect to ∼Σ.

(ii) If R ∈ L^r is m-ary, then R^{𝒜_Σ} ⊆ A_Σ^m is given by


([t1],...,[tm]) ∈ R^{𝒜_Σ} :⟺ Σ ⊢ Rt1...tm    (t1,...,tm ∈ Term_L).

(iii) If F ∈ L^f is n-ary, then F^{𝒜_Σ} : A_Σ^n → A_Σ is given by

F^{𝒜_Σ}([t1],...,[tn]) = [Ft1...tn]    (t1,...,tn ∈ Term_L).


Remark. The reader should verify that this counts as a definition, that is:
in (ii), whether or not Σ ⊢ Rt1...tm depends only on ([t1],...,[tm]), not on
(t1,...,tm); in (iii), [Ft1...tn] depends likewise only on ([t1],...,[tn]). (Use
parts (4) and (5) of Lemma 3.2.3.)
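
To see the construction in a small case, take as an assumption (not from these notes) the language L = {c, f} with a constant symbol c and a unary function symbol f, and Σ = {∀x(fffx = x)}. Writing f^n c for the term with n symbols f, one has Σ ⊢ f^m c = f^n c exactly when m − n is divisible by 3: provability follows from the axiom with Lemmas 3.2.1 and 3.2.3, and unprovability otherwise follows from Proposition 2.7.3 applied to the model Z/3Z with c interpreted as 0 and f as x ↦ x + 1. The Python sketch below computes with representatives of the three resulting classes and checks that the interpretation of f does not depend on the chosen representative.

```python
def depth(term):
    """Number of f's in a variable-free term; terms are "c" or ("f", term)."""
    n = 0
    while term != "c":
        term = term[1]
        n += 1
    return n

def cls(term):
    """Canonical representative f^(n mod 3) c of the class [t], t = f^n c."""
    rep = "c"
    for _ in range(depth(term) % 3):
        rep = ("f", rep)
    return rep

def f_interp(rep):
    """Interpretation of f on classes: [t] -> [ft], on representatives."""
    return cls(("f", rep))

# Build the terms c, fc, ffc, ... and collect their classes.
terms = ["c"]
for _ in range(9):
    terms.append(("f", terms[-1]))
classes = {cls(t) for t in terms}
print(len(classes))  # number of elements of the term model in this example
```

Here `cls` plays the role of t ↦ [t], and `f_interp(cls(t)) == cls(("f", t))` for every term t in the list, which is the well-definedness asked for in (iii).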


Corollary 3.2.5. Suppose L has a constant symbol, and Σ is consistent. Then
(1) for each t ∈ Term_L we have t^{𝒜_Σ} = [t];
(2) for each atomic σ we have: Σ ⊢ σ ⟺ 𝒜_Σ ⊨ σ.


Proof. Part (1) follows by an easy induction. Let σ be Rt1...tm where R ∈ L^r
is m-ary and t1,...,tm ∈ Term_L. Then


Σ ⊢ Rt1...tm ⟺ ([t1],...,[tm]) ∈ R^{𝒜_Σ} ⟺ 𝒜_Σ ⊨ Rt1...tm,


where the last “⟺” follows from the definition of ⊨ together with part (1). Now
suppose that σ is t1 = t2 where t1,t2 ∈ Term_L. Then


Σ ⊢ t1 = t2 ⟺ [t1] = [t2] ⟺ t1^{𝒜_Σ} = t2^{𝒜_Σ} ⟺ 𝒜_Σ ⊨ t1 = t2.


We also have Σ ⊢ ⊤ ⟺ 𝒜_Σ ⊨ ⊤. So far we haven’t used the assumption that Σ
is consistent, but now we do. The consistency of Σ means that Σ ⊬ ⊥. We also
have 𝒜_Σ ⊭ ⊥ by definition of ⊨. Thus Σ ⊢ ⊥ ⟺ 𝒜_Σ ⊨ ⊥.


If the equivalence in part (2) of this corollary holds for all σ (not only for atomic
σ), then 𝒜_Σ ⊨ Σ, so we would have found a model of Σ, and be done. But
clearly this equivalence can only hold for all σ if Σ has the property that for
each σ, either Σ ⊢ σ or Σ ⊢ ¬σ. This property is of interest for other reasons
as well, and deserves a name:


Definition. We say that Σ is complete if Σ is consistent, and for each σ either
Σ ⊢ σ or Σ ⊢ ¬σ.


Example. Let L = L_Ab, Σ := Ab (the set of axioms for abelian groups), and
σ the sentence ∃x(x ≠ 0). Then Σ ⊬ σ since the trivial group doesn’t satisfy
σ. Also Σ ⊬ ¬σ, since there are non-trivial abelian groups and σ holds in such
groups. Thus Σ is not complete.

Completeness is a strong property and it can be hard to show that a given
set of axioms is complete. The set of axioms for algebraically closed fields of 
characteristic 0 is complete (see the end of Section 4.3). 

A key fact about completeness needed in this chapter is that any consistent 
set of sentences extends to a complete set of sentences: 


Lemma 3.2.6 (Lindenbaum). If Σ is consistent, then Σ ⊆ Σ′ for some complete
set Σ′ of L-sentences.


The proof uses Zorn’s Lemma, and is just like that of the corresponding fact of
Propositional Logic in Section 2.2.

Completeness of Σ does not guarantee that the equivalence of part (2) of
Corollary 3.2.5 holds for all σ. Completeness is only a necessary condition
for this equivalence to hold for all σ; another necessary condition is “to have
witnesses”:


Definition. A Σ-witness for the sentence ∃xφ(x) is a term t ∈ Term_L such
that Σ ⊢ φ(t). We say that Σ has witnesses if there is a Σ-witness for every
sentence ∃xφ(x) proved by Σ.


Theorem 3.2.7. Let L have a constant symbol, and suppose Σ is consistent.
Then the following two conditions are equivalent:

(i) For each σ we have: Σ ⊢ σ ⟺ 𝒜_Σ ⊨ σ.

(ii) Σ is complete and has witnesses.

In particular, if Σ is complete and has witnesses, then 𝒜_Σ is a model of Σ.


Proof. It should be clear that (i) implies (ii). For the converse, assume (ii). We
use induction on the number of logical symbols in σ to obtain (i). We already
know that (i) holds for atomic sentences. The cases that σ = ¬σ1, σ = σ1 ∨ σ2,
and σ = σ1 ∧ σ2 are treated just as in the proof of the corresponding Lemma
2.2.12 for Propositional Logic. It remains to consider two cases:

Case σ = ∃xφ(x):

(⇒) Suppose that Σ ⊢ σ. Because we are assuming that Σ has witnesses we have
a t ∈ Term_L such that Σ ⊢ φ(t). Then by the inductive hypothesis 𝒜_Σ ⊨ φ(t).
So by Lemma 2.6.2 we have an a ∈ A_Σ such that 𝒜_Σ ⊨ φ(a). Therefore
𝒜_Σ ⊨ ∃xφ(x), hence 𝒜_Σ ⊨ σ.

(⇐) Assume 𝒜_Σ ⊨ σ. Then there is an a ∈ A_Σ such that 𝒜_Σ ⊨ φ(a). Choose
t ∈ Term_L such that [t] = a. Then t^{𝒜_Σ} = a, hence 𝒜_Σ ⊨ φ(t) by Lemma 2.6.2.
Applying the inductive hypothesis we get Σ ⊢ φ(t). This yields Σ ⊢ ∃xφ(x) by
MP and the quantifier axiom φ(t) → ∃xφ(x).

Case σ = ∀xφ(x): This is similar to the previous case but we also need the
result from Exercise 8 in Section 2.7 that ⊢ ¬∀xφ ↔ ∃x¬φ.


We call attention to some new notation in the next lemmas: the symbol ⊢_L is
used to emphasize that we are dealing with formal provability within L.


Lemma 3.2.8. Let Σ be a set of L-sentences, c a constant symbol not in L,
and L_c := L ∪ {c}. Let φ(y) be an L-formula and suppose Σ ⊢_{L_c} φ(c). Then
Σ ⊢_L φ(y).


Proof. (Sketch) Take a proof of φ(c) from Σ in the language L_c, and take a
variable z different from all variables occurring in that proof, and also such that
z ≠ y. Replace in every formula in this proof each occurrence of c by z. Check
that one obtains in this way a proof of φ(z/y) in the language L from Σ. So
Σ ⊢_L φ(z/y) and hence by Lemma 3.2.1 we have Σ ⊢_L φ(z/y)(y/z), that is,
Σ ⊢_L φ(y).


Lemma 3.2.9. Assume Σ is consistent and Σ ⊢ ∃yφ(y). Let c be a constant
symbol not in L. Put L_c := L ∪ {c}. Then Σ ∪ {φ(c)} is a consistent set of
L_c-sentences.


Proof. Suppose not. Then Σ ∪ {φ(c)} ⊢_{L_c} ⊥. By the Deduction Lemma (3.1.3)
Σ ⊢_{L_c} φ(c) → ⊥. Then by Lemma 3.2.8 we have Σ ⊢_L φ(y) → ⊥. By G we have
Σ ⊢_L ∃yφ(y) → ⊥. Applying MP yields Σ ⊢ ⊥, contradicting the consistency
of Σ.

Lemma 3.2.10. Suppose Σ is consistent. Let σ1 = ∃x1φ1(x1), ..., σn =
∃xnφn(xn) be such that Σ ⊢ σi for every i = 1,...,n. Let c1,...,cn be
distinct constant symbols not in L. Put L′ := L ∪ {c1,...,cn} and Σ′ :=
Σ ∪ {φ1(c1),...,φn(cn)}. Then Σ′ is a consistent set of L′-sentences.


Proof. The previous lemma covers the case n = 1. The general case follows by
induction on n.


In the next lemma we use a superscript “w” for “witness.” 


Lemma 3.2.11. Suppose Σ is consistent. For each L-sentence σ = ∃xφ(x) such
that Σ ⊢ σ, let c_σ be a constant symbol not in L such that if σ′ is a different
L-sentence of the form ∃x′φ′(x′) provable from Σ, then c_σ ≠ c_{σ′}. Put


L^w := L ∪ {c_σ : σ = ∃xφ(x) is an L-sentence such that Σ ⊢ σ},
Σ^w := Σ ∪ {φ(c_σ) : σ = ∃xφ(x) is an L-sentence such that Σ ⊢ σ}.


Then Σ^w is a consistent set of L^w-sentences.


Proof. Suppose not. Then Σ^w ⊢ ⊥. Take a proof of ⊥ from Σ^w and let
c_{σ1},...,c_{σn} be constant symbols in L^w \ L such that this proof is a proof
of ⊥ in the language L ∪ {c_{σ1},...,c_{σn}} from Σ ∪ {φ1(c_{σ1}),...,φn(c_{σn})}, where
σi = ∃xiφi(xi) for 1 ≤ i ≤ n. So Σ ∪ {φ1(c_{σ1}),...,φn(c_{σn})} is an inconsistent
set of L ∪ {c_{σ1},...,c_{σn}}-sentences. This contradicts Lemma 3.2.10.


Lemma 3.2.12. Let L0 ⊆ L1 ⊆ L2 ⊆ ... be an increasing sequence (L_n) of
languages, and set L_∞ := ⋃_n L_n. Let Σ_n be a consistent set of L_n-sentences,
for each n, such that Σ0 ⊆ Σ1 ⊆ Σ2 ⊆ .... Then the union Σ_∞ := ⋃_n Σ_n is a
consistent set of L_∞-sentences.


Proof. Suppose that Σ_∞ ⊢ ⊥. Take a proof of ⊥ from Σ_∞. Then we can choose
n so large that this is actually a proof of ⊥ from Σ_n in L_n. This contradicts the
consistency of Σ_n.


Suppose the language L* extends L, let 𝒜 be an L-structure, and let 𝒜* be an
L*-structure. Then 𝒜 is said to be a reduct of 𝒜* (and 𝒜* an expansion of 𝒜)
if 𝒜 and 𝒜* have the same underlying set and the same interpretations of the
symbols of L. For example, (N; 0,+) is a reduct of (N; <,0,1,+,·). Note that
any L*-structure 𝒜* has a unique reduct to an L-structure, which we indicate
by 𝒜*|_L. A key fact (to be verified by the reader) is that if 𝒜 is a reduct of 𝒜*,
then t^𝒜 = t^{𝒜*} for all variable-free L_A-terms t, and


𝒜 ⊨ σ ⟺ 𝒜* ⊨ σ


for all L_A-sentences σ.
We can now prove Theorem 3.1.1. 


Proof. Let Σ be a consistent set of L-sentences. We construct a sequence (L_n)
of languages and a sequence (Σ_n) where each Σ_n is a consistent set of L_n-
sentences. We begin by setting L0 = L and Σ0 = Σ. Given the language L_n
and the consistent set of L_n-sentences Σ_n, put


L_{n+1} := L_n      if n is even,
L_{n+1} := L_n^w    if n is odd,


choose a complete set of L_n-sentences Σ_n′ ⊇ Σ_n, and put


Σ_{n+1} := Σ_n′     if n is even,
Σ_{n+1} := Σ_n^w    if n is odd.


Here L_n^w and Σ_n^w are obtained from L_n and Σ_n in the same way that L^w and
Σ^w are obtained from L and Σ in Lemma 3.2.11. Note that L_n ⊆ L_{n+1}, and
Σ_n ⊆ Σ_{n+1}.

By the previous lemma the set Σ_∞ of L_∞-sentences is consistent. It is also
complete. To see this, let σ be an L_∞-sentence. Take n even and so large that σ
is an L_n-sentence. Then Σ_{n+1} ⊢ σ or Σ_{n+1} ⊢ ¬σ and thus Σ_∞ ⊢ σ or Σ_∞ ⊢ ¬σ.

We claim that Σ_∞ has witnesses. To see this, let σ = ∃xφ(x) be an L_∞-
sentence such that Σ_∞ ⊢ σ. Now take n to be odd and so large that σ is
an L_n-sentence and Σ_n ⊢ σ. Then by construction of Σ_{n+1} = Σ_n^w we have
Σ_{n+1} ⊢ φ(c_σ), so Σ_∞ ⊢ φ(c_σ).

It follows from Theorem 3.2.7 that Σ_∞ has a model, namely 𝒜_{Σ_∞}. Put
𝒜 := 𝒜_{Σ_∞}|_L. Then 𝒜 ⊨ Σ. This concludes the proof of the Completeness
Theorem (second form).


Exercises. 
(1) Σ is complete if and only if Σ has a model and every two models of Σ satisfy the
same sentences.


(2) Let L have just a constant symbol c, a unary relation symbol U and a unary
function symbol f, and suppose that Σ ⊢ Ufc, and that f does not occur in the
sentences of Σ. Then Σ ⊢ ∀xUx.


3.3 Some Elementary Results of Predicate Logic


Here we obtain some generalities of predicate logic: Equivalence and Equality 
Theorems, Variants, and Prenex Form. In some proofs we shall take advantage 
of the fact that the Completeness Theorem is now available. 


Lemma 3.3.1 (Distribution Rule). We have the following: 


(i) Suppose Σ ⊢ φ → ψ. Then Σ ⊢ ∃xφ → ∃xψ and Σ ⊢ ∀xφ → ∀xψ.


(ii) Suppose Σ ⊢ φ ↔ ψ. Then Σ ⊢ ∃xφ ↔ ∃xψ and Σ ⊢ ∀xφ ↔ ∀xψ.


Proof. We only do (i), since (ii) then follows easily. Let 𝒜 be a model of Σ. By
the Completeness Theorem it suffices to show that then 𝒜 ⊨ ∃xφ → ∃xψ and
𝒜 ⊨ ∀xφ → ∀xψ. We shall prove 𝒜 ⊨ ∃xφ → ∃xψ and leave the other part
to the reader. We have 𝒜 ⊨ φ → ψ. Choose variables y1,...,yn such that
φ = φ(x,y1,...,yn) and ψ = ψ(x,y1,...,yn). We need only show that then
for all a1,...,an ∈ A


𝒜 ⊨ ∃xφ(x,a1,...,an) → ∃xψ(x,a1,...,an).


Suppose 𝒜 ⊨ ∃xφ(x,a1,...,an). This yields a0 ∈ A with 𝒜 ⊨ φ(a0,a1,...,an).
From 𝒜 ⊨ φ → ψ we obtain 𝒜 ⊨ φ(a0,...,an) → ψ(a0,...,an), which gives
𝒜 ⊨ ψ(a0,...,an), and thus 𝒜 ⊨ ∃xψ(x,a1,...,an).


Theorem 3.3.2 (Equivalence Theorem). Let ψ′ be the result of replacing in the
formula ψ some occurrence of a subformula φ by the formula φ′, and suppose
that Σ ⊢ φ ↔ φ′. Then ψ′ is again a formula and Σ ⊢ ψ ↔ ψ′.


Proof. By induction on the number of logical symbols in ψ. If ψ is atomic, then
necessarily ψ = φ and ψ′ = φ′ and the desired result holds trivially.

Suppose that ψ = ¬θ. Then either ψ = φ and ψ′ = φ′, and the desired
result holds trivially, or the occurrence of φ we are replacing is an occurrence
in θ. Then the inductive hypothesis gives Σ ⊢ θ ↔ θ′, where θ′ is obtained by
replacing that occurrence (of φ) by φ′. Then ψ′ = ¬θ′ and the desired result
follows easily. The cases ψ = ψ1 ∨ ψ2 and ψ = ψ1 ∧ ψ2 are left as exercises.

Suppose that ψ = ∃xθ. The case ψ = φ (and thus ψ′ = φ′) is trivial. Suppose
ψ ≠ φ. Then the occurrence of φ we are replacing is an occurrence inside θ.
So by inductive hypothesis we have Σ ⊢ θ ↔ θ′. Then by the distribution rule
Σ ⊢ ∃xθ ↔ ∃xθ′. The proof is similar if ψ = ∀xθ.


Definition. We say φ1 and φ2 are Σ-equivalent if Σ ⊢ φ1 ↔ φ2. (In case Σ = ∅,
we just say equivalent.) One verifies easily that Σ-equivalence is an equivalence
relation on the set of L-formulas.


Given a family (φi)_{i∈I} of formulas with finite index set I we choose a bijection
k ↦ i(k) : {1,...,n} → I and set


⋁_{i∈I} φi := φ_{i(1)} ∨ ··· ∨ φ_{i(n)},    ⋀_{i∈I} φi := φ_{i(1)} ∧ ··· ∧ φ_{i(n)}.


If I is clear from context we just write ⋁_i φi and ⋀_i φi instead. Of course, these
notations ⋁_{i∈I} φi and ⋀_{i∈I} φi can only be used when the particular choice of
bijection of {1,...,n} with I does not matter; this is usually the case because
the equivalence class of φ_{i(1)} ∨ ··· ∨ φ_{i(n)} does not depend on this choice, and
the same is true with “∧” instead of “∨”.


Definition. A variant of a formula is obtained by successive replacements of 
the following type: 

(i) replace an occurrence of a subformula ∃xφ by ∃yφ(y/x);

(ii) replace an occurrence of a subformula ∀xφ by ∀yφ(y/x),

where y is free for x in φ and y does not occur free in φ.


Lemma 3.3.3. A formula is equivalent to any of its variants. 


Proof. By the Equivalence Theorem (3.3.2) it suffices to show ⊢ ∃xφ ↔ ∃yφ(y/x)
and ⊢ ∀xφ ↔ ∀yφ(y/x) where y is free for x in φ and does not occur free in φ.
We prove the first equivalence, leaving the second as an exercise. Applying G
to the quantifier axiom φ(y/x) → ∃xφ gives ⊢ ∃yφ(y/x) → ∃xφ. Similarly we
get ⊢ ∃xφ → ∃yφ(y/x) (use that φ = φ(y/x)(x/y) by the assumption on y).
An application of Exercise 7 of Section 2.7 finishes the proof.


Definition. A formula in prenex form is a formula Q1x1...Qnxnφ where
x1,...,xn are distinct variables, each Qi ∈ {∃,∀} and φ is quantifier-free. We
call Q1x1...Qnxn the prefix, and φ the matrix of the formula. Note that a
quantifier-free formula is in prenex form; this is the case n = 0.


We leave the proof of the next lemma as an exercise. Instead of “occurrence of
... as a subformula” we say “part ...”. In this lemma Q denotes a quantifier,
that is, Q ∈ {∃,∀}, and Q′ denotes the other quantifier: ∃′ = ∀ and ∀′ = ∃.


Lemma 3.3.4. The following prenex transformations always change a formula 
into an equivalent formula: 
(1) replace the formula by one of its variants; 


(2) replace a part ¬Qxψ by Q′x¬ψ;
(3) replace a part (Qxψ) ∨ θ by Qx(ψ ∨ θ) where x is not free in θ;
(4) replace a part ψ ∨ Qxθ by Qx(ψ ∨ θ) where x is not free in ψ;
(5) replace a part (Qxψ) ∧ θ by Qx(ψ ∧ θ) where x is not free in θ;
(6) replace a part ψ ∧ Qxθ by Qx(ψ ∧ θ) where x is not free in ψ.


Remark. Note that the free variables of a formula (those that occur free in the 
formula) do not change under prenex transformations. 


Theorem 3.3.5 (Prenex Form). Every formula can be changed into one in 
prenex form by a finite sequence of prenex transformations. In particular, each 
formula is equivalent to one in prenex form. 


Proof. By induction on the number of logical symbols. Atomic formulas are
already in prenex form. To simplify notation, write φ ⟹_pr ψ to indicate
that ψ can be obtained from φ by a finite sequence of prenex transformations.
Assume inductively that


φ1 ⟹_pr Q1x1...Qmxmψ1,
φ2 ⟹_pr Q_{m+1}y1...Q_{m+n}ynψ2,


where Q1,...,Qm,...,Q_{m+n} ∈ {∃,∀}, x1,...,xm are distinct, y1,...,yn are
distinct, and ψ1 and ψ2 are quantifier-free.
Then for φ := ¬φ1, we have


φ ⟹_pr ¬Q1x1...Qmxmψ1.

Applying m prenex transformations of type (2) we get

¬Q1x1...Qmxmψ1 ⟹_pr Q1′x1...Qm′xm¬ψ1,


hence φ ⟹_pr Q1′x1...Qm′xm¬ψ1.
Next, let φ := φ1 ∨ φ2. The assumptions above yield


φ ⟹_pr (Q1x1...Qmxmψ1) ∨ (Q_{m+1}y1...Q_{m+n}ynψ2).


Replacing first Q1x1...Qmxmψ1 by a variant we may assume that


{x1,...,xm} ∩ {y1,...,yn} = ∅,


and that no xi occurs free in ψ2. Next replace Q_{m+1}y1...Q_{m+n}ynψ2 by a
variant to arrange that in addition no yj occurs free in ψ1. Applying m + n
times prenex transformations of types (3) and (4) we obtain


(Q1x1...Qmxmψ1) ∨ (Q_{m+1}y1...Q_{m+n}ynψ2) ⟹_pr
Q1x1...QmxmQ_{m+1}y1...Q_{m+n}yn(ψ1 ∨ ψ2).


Hence φ ⟹_pr Q1x1...QmxmQ_{m+1}y1...Q_{m+n}yn(ψ1 ∨ ψ2). Likewise, to deal
with φ1 ∧ φ2, we apply prenex transformations of types (5) and (6).

Next, let φ := ∃xφ1. Applying prenex transformations of type (1) we
can assume x1,...,xm differ from x. Then φ ⟹_pr ∃xQ1x1...Qmxmψ1, and
∃xQ1x1...Qmxmψ1 is in prenex form. The case φ := ∀xφ1 is similar.
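
The induction in this proof is effectively an algorithm. The Python sketch below follows it under simplifying assumptions that are ours rather than the notes': formulas are nested tuples, and every bound variable is first renamed to a fresh variable `v0`, `v1`, ... (a variant, transformation (1)), so that the side conditions of transformations (3)-(6) hold automatically when quantifiers are pulled out.

```python
from itertools import count

DUAL = {"exists": "forall", "forall": "exists"}

# Assumed encoding: ("atom", name, vars), ("not", f), ("or", f, g),
# ("and", f, g), ("exists", x, f), ("forall", x, f)

def variables(f):
    """All variable names occurring (free or bound) in f."""
    if f[0] == "atom":
        return set(f[2])
    if f[0] in DUAL:
        return {f[1]} | variables(f[2])
    return set().union(*(variables(g) for g in f[1:]))

def rename(f, x, v):
    """Replace x by v in f; used only when no quantifier in f binds x."""
    if f[0] == "atom":
        return ("atom", f[1], tuple(v if u == x else u for u in f[2]))
    if f[0] in DUAL:
        return (f[0], f[1], rename(f[2], x, v))
    return (f[0],) + tuple(rename(g, x, v) for g in f[1:])

def split(f):
    """Split a prenex formula into its prefix (a list) and its matrix."""
    prefix = []
    while f[0] in DUAL:
        prefix.append((f[0], f[1]))
        f = f[2]
    return prefix, f

def attach(prefix, matrix):
    """Put the quantifier prefix back in front of the matrix."""
    for q, x in reversed(prefix):
        matrix = (q, x, matrix)
    return matrix

def prenex(f, fresh=None):
    """Return a prenex form of f, with all bound variables renamed."""
    if fresh is None:
        used = variables(f)
        fresh = (v for v in (f"v{i}" for i in count()) if v not in used)
    tag = f[0]
    if tag == "atom":
        return f
    if tag == "not":  # transformation (2), applied along the whole prefix
        prefix, m = split(prenex(f[1], fresh))
        return attach([(DUAL[q], x) for q, x in prefix], ("not", m))
    if tag in ("or", "and"):  # transformations (3)-(6)
        p1, m1 = split(prenex(f[1], fresh))
        p2, m2 = split(prenex(f[2], fresh))
        return attach(p1 + p2, (tag, m1, m2))
    v = next(fresh)  # quantifier case: pass to a variant first
    return (tag, v, rename(prenex(f[2], fresh), f[1], v))

# Example: (not exists x P(x)) or (forall x Q(x))
example = ("or",
           ("not", ("exists", "x", ("atom", "P", ("x",)))),
           ("forall", "x", ("atom", "Q", ("x",))))
print(prenex(example))
```

For the example the result encodes ∀v0∀v1(¬P(v0) ∨ Q(v1)). Renaming every bound variable is cruder than the proof, which renames only when necessary, but it keeps the sketch short.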


We finish this section with results on equalities. Note that by Corollary 2.1.7,
the result of replacing an occurrence of an L-term τ in t by an L-term τ′ is an
L-term t′.


Proposition 3.3.6. Let τ and τ′ be L-terms such that Σ ⊢ τ = τ′, let t′ be the
result of replacing an occurrence of τ in t by τ′. Then Σ ⊢ t = t′.


Proof. We proceed by induction on terms. First note that if t = τ, then t′ = τ′.
This fact takes care of the case that t is a variable. Suppose t = Ft1...tn
where F ∈ L^f is n-ary and t1,...,tn are L-terms, and assume t ≠ τ. Using the
facts on admissible words at the end of Section 2.1, including exercise 5, we see
that t′ = Ft1′...tn′, where for some i ∈ {1,...,n} we have tj = tj′ for all j ≠ i,
j ∈ {1,...,n}, and ti′ is obtained from ti by replacing an occurrence of τ in ti
by τ′. Inductively we can assume that Σ ⊢ t1 = t1′,...,Σ ⊢ tn = tn′, so by part
(5) of Lemma 3.2.3 we have Σ ⊢ t = t′.


An occurrence of an L-term τ in φ is said to be proper if it is not an occurrence
immediately following a quantifier symbol. So if τ is not a variable, then any
occurrence of τ in any formula is proper. If τ is the variable x, then the second
symbol in ∃xφ is not a proper occurrence of τ in ∃xφ.


Proposition 3.3.7 (Equality Theorem). Let τ and τ′ be L-terms such that
Σ ⊢ τ = τ′. Let φ′ be the result of replacing a proper occurrence of τ in φ by
τ′. Then φ′ is an L-formula and Σ ⊢ φ ↔ φ′.

Proof. For atomic φ, argue as in the proof of Proposition 3.3.6. Next, proceed
by induction on formulas, using the Equivalence Theorem.


Exercises. For exercise (3) below, recall from Section 2.5 the notions of existential 
formula and universal formula. The result of exercise (4) is used in later chapters. 


(1) Let P be a unary relation symbol, Q be a binary relation symbol, and x, y distinct
variables. Use prenex transformations to put


∀x∃y(P(x) ∧ Q(x,y)) → ∃x∀y(Q(x,y) → P(y))

into prenex form.


(2) Let (φi)_{i∈I} be a family of formulas with finite index set I. Then


⊢ (∃x ⋁_i φi) ↔ (⋁_i ∃xφi),    ⊢ (∀x ⋀_i φi) ↔ (⋀_i ∀xφi).


(3) If φ1(x1,...,xm) and φ2(x1,...,xm) are existential formulas, then


(φ1 ∨ φ2)(x1,...,xm),    (φ1 ∧ φ2)(x1,...,xm)


are equivalent to existential formulas φ12^∨(x1,...,xm) and φ12^∧(x1,...,xm). The
same holds with “existential” replaced by “universal”.


(4) A formula is said to be unnested if each atomic subformula has the form Rx1...xm
with m-ary R ∈ L^r ∪ {⊤,⊥,=} and distinct variables x1,...,xm, or the form
Fx1...xn = x_{n+1} with n-ary F ∈ L^f and distinct variables x1,...,x_{n+1}. (This
allows ⊤ and ⊥ as atomic subformulas of unnested formulas.) Then:

(i) for each term t(x1,...,xm) and variable y ∉ {x1,...,xm} the formula


t(x1,...,xm) = y


is equivalent to an unnested existential formula θ1(x1,...,xm,y), and also to an
unnested universal formula θ2(x1,...,xm,y).

(ii) each atomic formula φ(y1,...,yn) is equivalent to an unnested existential
formula φ1(y1,...,yn), and also to an unnested universal formula φ2(y1,...,yn).
(iii) each formula φ(y1,...,yn) is equivalent to an unnested formula φ^u(y1,...,yn).


3.4 Equational Classes and Universal Algebra


The term-structure 𝒜_Σ introduced in the proof of the Completeness Theorem
also plays a role in what is called universal algebra. This is a general setting
for constructing mathematical objects by generators and relations. Free groups,
tensor products of various kinds, polynomial rings, and so on, are all special
cases of a single construction in universal algebra.

In this section we fix a language L that has only function symbols, including
at least one constant symbol. So L has no relation symbols. Instead of “L-
structure” we say “L-algebra”, and 𝒜, ℬ denote L-algebras. A substructure of
𝒜 is also called a subalgebra of 𝒜, and a quotient algebra of 𝒜 is an L-algebra
𝒜/∼ where ∼ is a congruence on 𝒜. We call 𝒜 trivial if |A| = 1. There is up to
isomorphism exactly one trivial L-algebra.


An L-identity is an L-sentence


∀x̄(s1(x̄) = t1(x̄) ∧ ··· ∧ sn(x̄) = tn(x̄)),    x̄ = (x1,...,xm)

where x1,...,xm are distinct variables and ∀x̄ abbreviates ∀x1...∀xm, and
where s1,t1,...,sn,tn are L-terms.


Given a set Σ of L-identities we define a Σ-algebra to be an L-algebra that
satisfies all identities in Σ, in other words, a Σ-algebra is the same as a model
of Σ. To such a Σ we associate the class Mod(Σ) of all Σ-algebras. A class C of
L-algebras is said to be equational if there is a set Σ of L-identities such that
C = Mod(Σ).


Examples. With L = L_Gr, Gr is a set of L-identities, and Mod(Gr), the
class of groups, is the corresponding equational class of L-algebras. With L =
L_Ring, Ring is a set of L-identities, and Mod(Ring), the class of rings, is the
corresponding equational class of L-algebras. If one adds to Ring the identity
∀x∀y(xy = yx) expressing the commutative law, then the corresponding class
is the class of commutative rings.


Theorem 3.4.1 (G. Birkhoff). Let C be a class of L-algebras. Then the class C
is equational if and only if the following conditions are satisfied:

(1) closure under isomorphism: if 𝒜 ∈ C and 𝒜 ≅ ℬ, then ℬ ∈ C;

(2) the trivial L-algebra belongs to C;

(3) every subalgebra of any algebra in C belongs to C;

(4) every quotient algebra of any algebra in C belongs to C;

(5) the product of any family (𝒜i) of algebras in C belongs to C.


It is easy to see that if C is equational, then conditions (1)-(5) are satisfied. For
(3) and (4) one can also appeal to Exercises 8 and 10 of Section 2.5. Towards
a proof of the converse, we need some universal-algebraic considerations that
are of interest beyond the connection to Birkhoff’s theorem.

For the rest of this section we fix a set Σ of L-identities. Associated to Σ is the
term algebra 𝒜_Σ whose elements are the equivalence classes [t] of variable-free
L-terms t, where two such terms s and t are equivalent iff Σ ⊢ s = t.


Lemma 3.4.2. 𝒜_Σ is a Σ-algebra.

Proof. Consider an identity

∀x̄(s1(x̄) = t1(x̄) ∧ ··· ∧ sn(x̄) = tn(x̄)),    x̄ = (x1,...,xm)


in Σ, let j ∈ {1,...,n} and put s = sj and t = tj. Let a1,...,am ∈ A_Σ and put
𝒜 = 𝒜_Σ. It suffices to show that then s(a1,...,am)^𝒜 = t(a1,...,am)^𝒜. Take
variable-free L-terms α1,...,αm such that a1 = [α1],...,am = [αm]. Then by
part (1) of Corollary 3.2.5 we have a1 = α1^𝒜,...,am = αm^𝒜, so

s(a1,...,am)^𝒜 = s(α1,...,αm)^𝒜,    t(a1,...,am)^𝒜 = t(α1,...,αm)^𝒜


by Lemma 2.6.1. Also, by part (1) of Corollary 3.2.5,


s(α1,...,αm)^𝒜 = [s(α1,...,αm)],    t(α1,...,αm)^𝒜 = [t(α1,...,αm)].

Now Σ ⊢ s(α1,...,αm) = t(α1,...,αm), so [s(α1,...,αm)] = [t(α1,...,αm)],
and thus s(a1,...,am)^𝒜 = t(a1,...,am)^𝒜, as desired.


Actually, we are going to show that $\mathcal{A}_\Sigma$ is a so-called initial $\Sigma$-algebra.


An initial $\Sigma$-algebra is a $\Sigma$-algebra $\mathcal{A}$ such that for any $\Sigma$-algebra $\mathcal{B}$ there is a
unique homomorphism $\mathcal{A} \to \mathcal{B}$.

For example, the trivial group is an initial Gr-algebra, and the ring of integers 
is an initial Ring-algebra. 

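Suppose $\mathcal{A}$ and $\mathcal{B}$ are both initial $\Sigma$-algebras. Then there is a unique
isomorphism $\mathcal{A} \to \mathcal{B}$. To see this, let $i$ and $j$ be the unique homomorphisms
$\mathcal{A} \to \mathcal{B}$ and $\mathcal{B} \to \mathcal{A}$, respectively. Then we have homomorphisms $j \circ i : \mathcal{A} \to \mathcal{A}$
and $i \circ j : \mathcal{B} \to \mathcal{B}$, respectively, so necessarily $j \circ i = \mathrm{id}_A$ and $i \circ j = \mathrm{id}_B$,
so $i$ and $j$ are isomorphisms. So if there is an initial $\Sigma$-algebra, it is unique
up-to-unique-isomorphism.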


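Lemma 3.4.3. $\mathcal{A}_\Sigma$ is an initial $\Sigma$-algebra.


Proof. Let $\mathcal{B}$ be any $\Sigma$-algebra. Note that if $s,t \in \mathrm{Term}_L$ and $[s] = [t]$, then
$\Sigma \vdash s = t$, so $s^{\mathcal{B}} = t^{\mathcal{B}}$. Thus we have a map

$$A_\Sigma \to B, \qquad [t] \mapsto t^{\mathcal{B}}.$$

It is easy to check that this map is a homomorphism $\mathcal{A}_\Sigma \to \mathcal{B}$. By Exercise 4
in Section 2.4 it is the only such homomorphism.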
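
The map $[t] \mapsto t^{\mathcal{B}}$ can be made concrete for the language of rings. The following is a minimal sketch, not part of the notes: the tuple encoding of terms and the names `evaluate`, `INTEGERS` and `integers_mod` are assumptions made for illustration. It evaluates a variable-free ring term in the ring of integers and in $\mathbb{Z}/5\mathbb{Z}$, so one can see that reduction mod 5 (the unique ring homomorphism out of the initial ring) agrees with direct evaluation.

```python
# Variable-free terms of the language of rings {0, 1, -, +, *},
# encoded (an assumption of this sketch) as nested tuples.
ZERO, ONE = ("0",), ("1",)

def neg(s): return ("-", s)
def add(s, t): return ("+", s, t)
def mul(s, t): return ("*", s, t)

def evaluate(term, ring):
    """Return the value of a variable-free term in the given ring."""
    op, *args = term
    values = [evaluate(a, ring) for a in args]
    return ring[op](*values)

# The ring of integers: each symbol is interpreted by a Python function.
INTEGERS = {
    "0": lambda: 0,
    "1": lambda: 1,
    "-": lambda a: -a,
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}

def integers_mod(n):
    """The ring Z/nZ with representatives 0, ..., n-1."""
    return {
        "0": lambda: 0,
        "1": lambda: 1 % n,
        "-": lambda a: (-a) % n,
        "+": lambda a, b: (a + b) % n,
        "*": lambda a, b: (a * b) % n,
    }

two = add(ONE, ONE)
# The term (1+1+1) * -((1+1)*(1+1)).
sample = mul(add(two, ONE), neg(mul(two, two)))
value_in_Z = evaluate(sample, INTEGERS)
value_mod_5 = evaluate(sample, integers_mod(5))
```

Evaluating first in the integers and then reducing mod 5 gives the same element as evaluating directly in $\mathbb{Z}/5\mathbb{Z}$, as the uniqueness of the homomorphism predicts.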


Free algebras. Let $I$ be an index set in what follows. Let $\mathcal{A}$ be a $\Sigma$-algebra
and $(a_i)_{i\in I}$ an $I$-indexed family of elements of $A$. Then $\mathcal{A}$ is said to be a free
$\Sigma$-algebra on $(a_i)$ if for every $\Sigma$-algebra $\mathcal{B}$ and $I$-indexed family $(b_i)$ of elements
of $B$ there is exactly one homomorphism $h : \mathcal{A} \to \mathcal{B}$ such that $h(a_i) = b_i$ for all
$i \in I$. We also express this by “$(\mathcal{A}, (a_i))$ is a free $\Sigma$-algebra”. Finally, $\mathcal{A}$ itself is
sometimes referred to as a free $\Sigma$-algebra if there is a family $(a_i)_{i\in I}$ in $A$ such
that $(\mathcal{A}, (a_i))$ is a free $\Sigma$-algebra.


As an example, take $L = L_{\mathrm{Ring}}$ and $\mathrm{cRi} := \mathrm{Ring} \cup \{\forall x \forall y\; xy = yx\}$, where $x,y$
are distinct variables. So the cRi-algebras are just the commutative rings. Let
$\mathbb{Z}[X_1,\dots,X_n]$ be the ring of polynomials in distinct indeterminates $X_1,\dots,X_n$
over $\mathbb{Z}$. For any commutative ring $R$ and elements $b_1,\dots,b_n \in R$ we have a
unique ring homomorphism $\mathbb{Z}[X_1,\dots,X_n] \to R$ that sends $X_i$ to $b_i$ for $i =
1,\dots,n$, namely the evaluation map (or substitution map)

$$\mathbb{Z}[X_1,\dots,X_n] \to R, \qquad f(X_1,\dots,X_n) \mapsto f(b_1,\dots,b_n).$$

Thus $\mathbb{Z}[X_1,\dots,X_n]$ is a free commutative ring on $(X_i)_{1\le i\le n}$.
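
As an illustration of the evaluation map, here is a small sketch under stated assumptions: polynomials over the integers are stored as dictionaries from exponent tuples to coefficients, the target ring is $\mathbb{Z}/m\mathbb{Z}$, and the helper names `poly_add`, `poly_mul` and `evaluate_at` are hypothetical. It lets one check on examples that evaluation respects sums and products.

```python
from itertools import product

def poly_add(f, g):
    """Add polynomials given as {exponent tuple: integer coefficient}."""
    h = dict(f)
    for e, c in g.items():
        h[e] = h.get(e, 0) + c
    return {e: c for e, c in h.items() if c != 0}

def poly_mul(f, g):
    """Multiply polynomials in the same representation."""
    h = {}
    for (e1, c1), (e2, c2) in product(f.items(), g.items()):
        e = tuple(a + b for a, b in zip(e1, e2))
        h[e] = h.get(e, 0) + c1 * c2
    return {e: c for e, c in h.items() if c != 0}

def evaluate_at(f, point, modulus):
    """Evaluation map Z[X_1..X_n] -> Z/modulus sending X_i to point[i]."""
    total = 0
    for exps, coeff in f.items():
        term = coeff
        for b, e in zip(point, exps):
            term *= pow(b, e, modulus)
        total += term
    return total % modulus

f = {(1, 0): 1, (0, 1): 2}     # X1 + 2*X2
g = {(2, 0): 3, (0, 0): -1}    # 3*X1^2 - 1
point, m = (2, 3), 7
```

Other choices of `point` and `m` give other homomorphisms; freeness says that each choice of images for the indeterminates yields exactly one.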


For a simpler example, let $L = L_{\mathrm{Mo}} := \{1, \cdot\} \subseteq L_{\mathrm{Gr}}$ be the language of monoids,
and consider

$$\mathrm{Mo} := \{\forall x\,(1 \cdot x = x \wedge x \cdot 1 = x),\ \forall x \forall y \forall z\,((xy)z = x(yz))\},$$

where $x,y,z$ are distinct variables. A monoid, or semigroup with identity, is a
model $\mathcal{A} = (A; 1, \cdot)$ of Mo, and we call $1 \in A$ the identity of the monoid $\mathcal{A}$, and
$\cdot$ its product operation.

Let $E^*$ be the set of words on an alphabet $E$, and consider $E^*$ as a monoid by
taking the empty word as its identity and the concatenation operation $(v, w) \mapsto
vw$ as its product operation. Then $E^*$ is a free monoid on the family $(e)_{e\in E}$ of
words of length 1, because for any monoid $\mathcal{B}$ and elements $b_e \in B$ (for $e \in E$)
we have a unique monoid homomorphism $E^* \to \mathcal{B}$ that sends each $e \in E$ to $b_e$,
namely,

$$e_1 \dots e_n \mapsto b_{e_1} \cdots b_{e_n}.$$
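
The homomorphism $e_1 \dots e_n \mapsto b_{e_1} \cdots b_{e_n}$ is a fold over the word. Below is a minimal sketch, assuming Python strings as words and, as target monoid, $2 \times 2$ integer matrices under multiplication; the names `extend_to_words` and `mat_mul` are chosen here for illustration only.

```python
from functools import reduce

def extend_to_words(images, op, identity):
    """Return the monoid homomorphism E* -> B determined by e -> images[e]."""
    def h(word):
        # Multiply the images of the letters from left to right.
        return reduce(op, (images[e] for e in word), identity)
    return h

def mat_mul(p, q):
    """Product of 2x2 integer matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = p
    (e, f), (g, k) = q
    return ((a * e + b * g, a * f + b * k), (c * e + d * g, c * f + d * k))

IDENTITY = ((1, 0), (0, 1))
images = {"a": ((1, 1), (0, 1)), "b": ((1, 0), (1, 1))}
h = extend_to_words(images, mat_mul, IDENTITY)
```

The empty word goes to the identity matrix and concatenation goes to matrix product. The target is deliberately non-commutative, so `h("ab")` and `h("ba")` differ.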


Remark. If $\mathcal{A}$ and $\mathcal{B}$ are both free $\Sigma$-algebras on $(a_i)$ and $(b_i)$ respectively, with
the same index set $I$, and $g : \mathcal{A} \to \mathcal{B}$ and $h : \mathcal{B} \to \mathcal{A}$ are the unique homomorphisms
such that $g(a_i) = b_i$ and $h(b_i) = a_i$ for all $i$, then $(h \circ g)(a_i) = a_i$ for all $i$, so
$h \circ g = \mathrm{id}_A$, and likewise $g \circ h = \mathrm{id}_B$, so $g$ is an isomorphism with inverse $h$.
Thus, given $I$, there is, up to unique isomorphism preserving $I$-indexed families,
at most one free $\Sigma$-algebra on an $I$-indexed family of its elements.


We shall now construct free $\Sigma$-algebras as initial algebras by working in an
extended language. Let $L_I := L \cup \{c_i : i \in I\}$ be the language $L$ augmented by
new constant symbols $c_i$ for $i \in I$, where new means that $c_i \notin L$ for $i \in I$ and
$c_i \ne c_j$ for distinct $i,j \in I$. So an $L_I$-algebra $(\mathcal{B}, (b_i))$ is just an $L$-algebra $\mathcal{B}$
equipped with an $I$-indexed family $(b_i)$ of elements of $B$. Let $\Sigma_I$ be $\Sigma$ viewed
as a set of $L_I$-identities. Then a free $\Sigma$-algebra on an $I$-indexed family of its
elements is just an initial $\Sigma_I$-algebra. In particular, the $\Sigma_I$-algebra $\mathcal{A}_{\Sigma_I}$ is a
free $\Sigma$-algebra on $([c_i])$. Thus, up to unique isomorphism of $\Sigma_I$-algebras, there
is a unique free $\Sigma$-algebra on an $I$-indexed family of its elements.


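Let $(\mathcal{A}, (a_i)_{i\in I})$ be a free $\Sigma$-algebra. Then $\mathcal{A}$ is generated by $(a_i)$. To see
why, let $\mathcal{B}$ be the subalgebra of $\mathcal{A}$ generated by $(a_i)$. Then we have a unique
homomorphism $h : \mathcal{A} \to \mathcal{B}$ such that $h(a_i) = a_i$ for all $i \in I$, and then the
composition

$$\mathcal{A} \to \mathcal{B} \to \mathcal{A}$$

of $h$ with the inclusion is necessarily $\mathrm{id}_A$, so $B = A$.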


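Let $\mathcal{B}$ be any $\Sigma$-algebra, and take any family $(b_j)_{j\in J}$ in $B$ that generates $\mathcal{B}$.
Take a free $\Sigma$-algebra $(\mathcal{A}, (a_j)_{j\in J})$, and take the unique homomorphism $h :
(\mathcal{A}, (a_j)) \to (\mathcal{B}, (b_j))$. Then $h(t^{\mathcal{A}}(a_{j_1},\dots,a_{j_n})) = t^{\mathcal{B}}(b_{j_1},\dots,b_{j_n})$ for all $L$-terms
$t(x_1,\dots,x_n)$ and $j_1,\dots,j_n \in J$, so $h(A) = B$, and thus $h$ induces an
isomorphism $\mathcal{A}/{\sim_h} \cong \mathcal{B}$. We have shown: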


Every $\Sigma$-algebra is isomorphic to a quotient of a free $\Sigma$-algebra.


This fact can sometimes be used to reduce problems on $\Sigma$-algebras to the case
of free $\Sigma$-algebras; see the next subsection for an example.


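Proof of Birkhoff’s theorem. Let us say that a class $\mathcal{C}$ of $L$-algebras is closed
if it has properties (1)-(5) listed in Theorem 3.4.1. Assume $\mathcal{C}$ is closed; we have
to show that then $\mathcal{C}$ is equational. Indeed, let $\Sigma(\mathcal{C})$ be the set of $L$-identities
that are true in all algebras of $\mathcal{C}$. It is clear that each algebra in $\mathcal{C}$ is a $\Sigma(\mathcal{C})$-algebra,
and it remains to show that every $\Sigma(\mathcal{C})$-algebra belongs to $\mathcal{C}$. Here is
the key fact from which this will follow:


Claim. If $\mathcal{A}$ is an initial $\Sigma(\mathcal{C})$-algebra, then $\mathcal{A} \in \mathcal{C}$.


To prove this claim we take $\mathcal{A} := \mathcal{A}_{\Sigma(\mathcal{C})}$. For $s,t \in \mathrm{Term}_L$ such that $s = t$
does not belong to $\Sigma(\mathcal{C})$ we pick $\mathcal{B}_{s,t} \in \mathcal{C}$ such that $\mathcal{B}_{s,t} \models s \ne t$, and we
let $h_{s,t} : \mathcal{A} \to \mathcal{B}_{s,t}$ be the unique homomorphism, so $h_{s,t}([s]) \ne h_{s,t}([t])$. Let
$\mathcal{B} := \prod \mathcal{B}_{s,t}$ where the product is over all pairs $(s,t)$ as above, and let $h : \mathcal{A} \to \mathcal{B}$
be the homomorphism given by $h(a) = (h_{s,t}(a))$. Note that $\mathcal{B} \in \mathcal{C}$. Then $h$ is
injective. To see why, let $s,t \in \mathrm{Term}_L$ be such that $[s] \ne [t]$ in $A = A_{\Sigma(\mathcal{C})}$. Then
$s = t$ does not belong to $\Sigma(\mathcal{C})$, so $h_{s,t}([s]) \ne h_{s,t}([t])$, and thus $h([s]) \ne h([t])$.
This injectivity gives $\mathcal{A} \cong h(\mathcal{A}) \subseteq \mathcal{B}$, so $\mathcal{A} \in \mathcal{C}$. This finishes the proof of the
claim.


Now, every $\Sigma(\mathcal{C})$-algebra is isomorphic to a quotient of a free $\Sigma(\mathcal{C})$-algebra, so
it remains to show that free $\Sigma(\mathcal{C})$-algebras belong to $\mathcal{C}$. Let $\mathcal{A}$ be a free $\Sigma(\mathcal{C})$-algebra
on $(a_i)_{i\in I}$. Let $\mathcal{C}_I$ be the class of all $L_I$-algebras $(\mathcal{B}, (b_i))$ with $\mathcal{B} \in \mathcal{C}$.
It is clear that $\mathcal{C}_I$ is closed as a class of $L_I$-algebras. Now, $(\mathcal{A}, (a_i))$ is easily
seen to be an initial $\Sigma(\mathcal{C}_I)$-algebra. By the claim above, applied to $\mathcal{C}_I$ in place
of $\mathcal{C}$, we obtain $(\mathcal{A}, (a_i)) \in \mathcal{C}_I$, and thus $\mathcal{A} \in \mathcal{C}$.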

Generators and relations. Let $G$ be any set. Then we have a $\Sigma$-algebra $\mathcal{A}$
with a map $\iota : G \to A$ such that for any $\Sigma$-algebra $\mathcal{B}$ and any map $j : G \to B$
there is a unique homomorphism $h : \mathcal{A} \to \mathcal{B}$ such that $h \circ \iota = j$; in other words, $\mathcal{A}$
is free as a $\Sigma$-algebra on $(\iota g)_{g\in G}$. Note that if $(\mathcal{A}', \iota')$ (with $\iota' : G \to A'$) has the
same universal property as $(\mathcal{A}, \iota)$, then the unique homomorphism $h : \mathcal{A} \to \mathcal{A}'$
such that $h \circ \iota = \iota'$ is an isomorphism, so this universal property determines the
pair $(\mathcal{A}, \iota)$ up-to-unique-isomorphism. So there is no harm in calling $(\mathcal{A}, \iota)$ the
free $\Sigma$-algebra on $G$. Note that $\mathcal{A}$ is generated as an $L$-algebra by $\iota G$.

Here is a particular way of constructing the free $\Sigma$-algebra on $G$. Take the
language $L_G := L \cup G$ (disjoint union) with the elements of $G$ as constant
symbols. Let $\Sigma(G)$ be $\Sigma$ considered as a set of $L_G$-identities. Then $\mathcal{A} := \mathcal{A}_{\Sigma(G)}$
as an $L$-algebra with the map $g \mapsto [g] : G \to A_{\Sigma(G)}$ is the free $\Sigma$-algebra on $G$.

Next, let $R$ be a set of sentences $s(\vec{g}) = t(\vec{g})$ where $s(x_1,\dots,x_n)$ and
$t(x_1,\dots,x_n)$ are $L$-terms and $\vec{g} = (g_1,\dots,g_n) \in G^n$ (with $n$ depending on
the sentence). We wish to construct the $\Sigma$-algebra generated by $G$ with $R$ as set
of relations. This object is described up-to-isomorphism in the next lemma.


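Lemma 3.4.4. There is a $\Sigma$-algebra $\mathcal{A}(G, R)$ with a map $i : G \to A(G, R)$ such
that:

(1) $\mathcal{A}(G, R) \models s(i\vec{g}) = t(i\vec{g})$ for all $s(\vec{g}) = t(\vec{g})$ in $R$;

(2) for any $\Sigma$-algebra $\mathcal{B}$ and map $j : G \to B$ with $\mathcal{B} \models s(j\vec{g}) = t(j\vec{g})$ for all
$s(\vec{g}) = t(\vec{g})$ in $R$, there is a unique homomorphism $h : \mathcal{A}(G, R) \to \mathcal{B}$ such
that $h \circ i = j$.


Proof. Let $\Sigma(R) := \Sigma \cup R$, viewed as a set of $L_G$-sentences, let $\mathcal{A}(G, R) :=
\mathcal{A}_{\Sigma(R)}$, and define $i : G \to A(G, R)$ by $i(g) = [g]$. As before one sees that the
universal property of the lemma is satisfied.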


Footnote 1. The use of the term “relations” here has nothing to do with n-ary relations on sets.


Chapter 4 


Some Model Theory 


In this chapter we first derive the Löwenheim-Skolem Theorem. Next we develop
some basic methods related to proving completeness of a given set of axioms:
Vaught’s Test, back-and-forth, quantifier elimination. Each of these methods,
when successful, achieves a lot more than just establishing completeness.


4.1 Löwenheim-Skolem; Vaught’s Test


Below, the cardinality of a structure is defined to be the cardinality of its
underlying set. In this section we have the same conventions concerning $L$, $\mathcal{A}$, $t$,
$\varphi$, $\psi$, $\theta$, $\sigma$ and $\Sigma$ as in the beginning of Section 2.6, unless specified otherwise.


Theorem 4.1.1 (Countable Löwenheim-Skolem Theorem).
Suppose $L$ is countable and $\Sigma$ has a model. Then $\Sigma$ has a countable model.


Proof. Since Var is countable, the hypothesis that $L$ is countable yields that the
set of $L$-sentences is countable. Hence the language

$$L \cup \{c_\sigma : \Sigma \vdash \sigma \text{ where } \sigma \text{ is an } L\text{-sentence of the form } \exists x\,\varphi(x)\}$$

is countable, that is, adding witnesses keeps the language countable. The union
of countably many countable sets is countable, hence the set $L_\infty$ constructed
in the proof of the Completeness Theorem is countable. It follows that there
are only countably many variable-free $L_\infty$-terms, hence $\mathcal{A}_{\Sigma_\infty}$ is countable, and
thus its reduct $\mathcal{A}_{\Sigma_\infty}|_L$ is a countable model of $\Sigma$.


Remark. The above proof is the first time that we used the countability of the
set $\mathrm{Var} = \{v_0, v_1, v_2, \dots\}$ of variables. As promised in Section 2.4, we shall now
indicate why the Countable Löwenheim-Skolem Theorem goes through without
assuming that Var is countable.

Suppose that Var is uncountable. Take a countably infinite subset $\mathrm{Var}' \subseteq
\mathrm{Var}$. Then each sentence is equivalent to one whose variables are all from $\mathrm{Var}'$.
By replacing each sentence in $\Sigma$ by an equivalent one all of whose variables are
from $\mathrm{Var}'$, we obtain a countable set $\Sigma'$ of sentences such that $\Sigma$ and $\Sigma'$ have
the same models. As in the proof above, we obtain a countable model of $\Sigma'$
working throughout in the setting where only variables from $\mathrm{Var}'$ are used in
terms and formulas. This model is a countable model of $\Sigma$.


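The following test can be useful in showing that a set of axioms $\Sigma$ is complete.


Proposition 4.1.2 (Vaught’s Test). Let $L$ be countable, and suppose $\Sigma$ has a
model, and that all countable models of $\Sigma$ are isomorphic. Then $\Sigma$ is complete.


Proof. Suppose $\Sigma$ is not complete. Then there is $\sigma$ such that $\Sigma \nvdash \sigma$ and $\Sigma \nvdash \neg\sigma$.
Hence by the Löwenheim-Skolem Theorem there is a countable model $\mathcal{A}$ of $\Sigma$
such that $\mathcal{A} \not\models \sigma$, and there is a countable model $\mathcal{B}$ of $\Sigma$ such that $\mathcal{B} \not\models \neg\sigma$. We
have $\mathcal{A} \cong \mathcal{B}$, $\mathcal{A} \not\models \sigma$ and $\mathcal{B} \models \sigma$, contradiction.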


Example. Let $L = \emptyset$, so the $L$-structures are just the non-empty sets. Let
$\Sigma = \{\sigma_1, \sigma_2, \dots\}$ where

$$\sigma_n = \exists x_1 \dots \exists x_n \bigwedge_{1 \le i < j \le n} x_i \ne x_j.$$

The models of $\Sigma$ are exactly the infinite sets. All countable models of $\Sigma$ are
countably infinite and hence isomorphic to $\mathbb{N}$. Thus by Vaught’s Test $\Sigma$ is
complete.
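
The sentences of this example each say "there are at least $n$ elements", which is why a set satisfies all of them exactly when it is infinite. A brute-force sketch on finite sets, with the hypothetical helper name `satisfies_sigma`, makes the finite part of this visible: a set with $m$ elements satisfies the $n$-th sentence if and only if $m \ge n$.

```python
from itertools import product

def satisfies_sigma(universe, n):
    """Brute-force check of sigma_n: some n-tuple has pairwise distinct entries."""
    universe = list(universe)
    return any(
        all(xs[i] != xs[j] for i in range(n) for j in range(i + 1, n))
        for xs in product(universe, repeat=n)
    )

# Table of truth values for small sets {0, ..., m-1} and small n.
table = {(m, n): satisfies_sigma(range(m), n)
         for m in range(1, 5) for n in range(1, 6)}
```

So no finite set satisfies every sentence of the example, while every infinite set does.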


In this example the hypothesis of Vaught’s Test is trivially satisfied. In other
cases it may require work to check this hypothesis. One general method in model
theory, Back-and-Forth, is often used to verify the hypothesis of Vaught’s Test.
The next theorem is due to Cantor, but the proof we give stems from Hausdorff
and shows Back-and-Forth in action. To formulate that theorem we recall from
Section 2.6 that a totally ordered set is a structure $(A; <)$ for the language $L_O$
that satisfies the following axioms (where $x, y, z$ are distinct variables):

$$\forall x\,(x \not< x), \qquad \forall x \forall y \forall z\,((x < y \wedge y < z) \to x < z), \qquad \forall x \forall y\,(x < y \vee x = y \vee y < x).$$

A totally ordered set is said to be dense if it satisfies in addition the axiom

$$\forall x \forall y\,(x < y \to \exists z\,(x < z < y)),$$

and it is said to have no endpoints if it satisfies the axiom

$$\forall x \exists y \exists z\,(y < x < z).$$

So $(\mathbb{Q}; <)$ and $(\mathbb{R}; <)$ are dense totally ordered sets without endpoints.


Theorem 4.1.3 (Cantor). Any two countable dense totally ordered sets without
endpoints are isomorphic.

Proof. Let $(A; <)$ and $(B; <)$ be countable dense totally ordered sets without
endpoints. So $A = \{a_n : n \in \mathbb{N}\}$ and $B = \{b_n : n \in \mathbb{N}\}$. We define by recursion
a sequence $(\alpha_n)$ in $A$ and a sequence $(\beta_n)$ in $B$: Put $\alpha_0 := a_0$ and $\beta_0 := b_0$. Let
$n > 0$, and suppose we have distinct $\alpha_0,\dots,\alpha_{n-1}$ in $A$ and distinct $\beta_0,\dots,\beta_{n-1}$
in $B$ such that for all $i,j < n$ we have $\alpha_i < \alpha_j \iff \beta_i < \beta_j$. Then we define
$\alpha_n \in A$ and $\beta_n \in B$ as follows:


Case 1: $n$ is even. (Here we go forth.) First take $k \in \mathbb{N}$ minimal such that
$a_k \notin \{\alpha_0,\dots,\alpha_{n-1}\}$; then take $l \in \mathbb{N}$ minimal such that $b_l$ is situated with
respect to $\beta_0,\dots,\beta_{n-1}$ as $a_k$ is situated with respect to $\alpha_0,\dots,\alpha_{n-1}$, that is, $l$
is minimal such that for $i = 0,\dots,n-1$ we have: $\alpha_i < a_k \iff \beta_i < b_l$, and
$\alpha_i > a_k \iff \beta_i > b_l$. (The reader should check that such an $l$ exists: that is
where density and “no endpoints” come in.) Put $\alpha_n := a_k$ and $\beta_n := b_l$.


Case 2: $n$ is odd. (Here we go back.) First take $l \in \mathbb{N}$ minimal such that
$b_l \notin \{\beta_0,\dots,\beta_{n-1}\}$; next take $k \in \mathbb{N}$ minimal such that $a_k$ is situated with
respect to $\alpha_0,\dots,\alpha_{n-1}$ as $b_l$ is situated with respect to $\beta_0,\dots,\beta_{n-1}$, that is, $k$
is minimal such that for $i = 0,\dots,n-1$ we have: $\alpha_i < a_k \iff \beta_i < b_l$, and
$\alpha_i > a_k \iff \beta_i > b_l$. Put $\beta_n := b_l$ and $\alpha_n := a_k$.

One proves easily by induction on $n$ that then $a_n \in \{\alpha_0,\dots,\alpha_{2n}\}$ and $b_n \in
\{\beta_0,\dots,\beta_{2n}\}$. Thus we have a bijection $\alpha_n \mapsto \beta_n : A \to B$, and this bijection
is an isomorphism $(A; <) \to (B; <)$.
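
The recursion in this proof can be run for finitely many steps. The sketch below is an illustration under stated assumptions, not part of the notes: it takes the rationals and the dyadic rationals as the two orders, fixes one particular enumeration of each, and uses hypothetical names (`Enum`, `same_position`, `back_and_forth`). Even steps go forth, odd steps go back, exactly as in Cases 1 and 2.

```python
from fractions import Fraction
from itertools import count

def rationals():
    """Enumerate Q without repetition (one possible enumeration)."""
    seen = set()
    for s in count(1):
        for q in range(1, s + 1):
            for x in (Fraction(s - q, q), Fraction(q - s, q)):
                if x not in seen:
                    seen.add(x)
                    yield x

def dyadics():
    """Enumerate the dyadic rationals p / 2**k without repetition."""
    seen = set()
    for s in count(0):
        for k in range(s + 1):
            for x in (Fraction(s - k, 2**k), Fraction(k - s, 2**k)):
                if x not in seen:
                    seen.add(x)
                    yield x

class Enum:
    """A lazily cached enumeration a_0, a_1, ... of a countable set."""
    def __init__(self, gen):
        self._gen, self.items = gen, []

    def first(self, pred):
        """Return the element of least index satisfying pred."""
        i = 0
        while True:
            while i >= len(self.items):
                self.items.append(next(self._gen))
            if pred(self.items[i]):
                return self.items[i]
            i += 1

def same_position(x, xs, y, ys):
    """x is situated with respect to xs as y is with respect to ys."""
    return all((u < x) == (v < y) and (u > x) == (v > y)
               for u, v in zip(xs, ys))

def back_and_forth(A, B, steps):
    alpha = [A.first(lambda x: True)]
    beta = [B.first(lambda y: True)]
    for n in range(1, steps):
        if n % 2 == 0:   # forth: next unused a, then a matching b
            a = A.first(lambda x: x not in alpha)
            b = B.first(lambda y: same_position(a, alpha, y, beta))
        else:            # back: next unused b, then a matching a
            b = B.first(lambda y: y not in beta)
            a = A.first(lambda x: same_position(b, beta, x, alpha))
        alpha.append(a)
        beta.append(b)
    return alpha, beta

alpha, beta = back_and_forth(Enum(rationals()), Enum(dyadics()), 12)
```

After any number of steps the pairs `zip(alpha, beta)` form a finite order-preserving bijection; the searches in `first` terminate because both orders are dense and have no endpoints.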


Let $\Sigma$ be the set of axioms for dense totally ordered sets without endpoints as
indicated before the statement of Cantor’s theorem. Thus $\Sigma$ is a set of sentences
in the language $L_O$. By Vaught’s Test we obtain from Cantor’s theorem:


Corollary 4.1.4. $\Sigma$ is complete.


In the results below $\kappa$ is an infinite cardinal, construed as the set of all ordinals
$\lambda < \kappa$ (as is usual in set theory). We have the following generalization of the
Löwenheim-Skolem theorem.


Theorem 4.1.5 (Generalized Löwenheim-Skolem Theorem). Suppose $|L| \le \kappa$
and $\Sigma$ has an infinite model. Then $\Sigma$ has a model of cardinality $\kappa$.


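Proof. Let $\{c_\lambda\}_{\lambda<\kappa}$ be a family of $\kappa$ new constant symbols that are not in $L$ and
are pairwise distinct (that is, $c_\lambda \ne c_\mu$ for $\lambda < \mu < \kappa$). Let $L' = L \cup \{c_\lambda : \lambda < \kappa\}$
and let $\Sigma' = \Sigma \cup \{c_\lambda \ne c_\mu : \lambda < \mu < \kappa\}$. We claim that $\Sigma'$ has a model. By the
Compactness Theorem it suffices to show that, given any finite set $\Lambda \subseteq \kappa$, the set of $L'$-sentences

$$\Sigma_\Lambda := \Sigma \cup \{c_\lambda \ne c_\mu : \lambda, \mu \in \Lambda,\ \lambda \ne \mu\}$$

has a model. Take an infinite model $\mathcal{A}$ of $\Sigma$. We make an $L'$-expansion $\mathcal{A}_\Lambda$
of $\mathcal{A}$ by interpreting distinct $c_\lambda$’s with $\lambda \in \Lambda$ by distinct elements of $A$, and
interpreting the $c_\lambda$’s with $\lambda \notin \Lambda$ arbitrarily. Then $\mathcal{A}_\Lambda$ is a model of $\Sigma_\Lambda$.

Note that $L'$ also has size at most $\kappa$. The same arguments we used in proving
the countable version of the Löwenheim-Skolem Theorem show that then $\Sigma'$
has a model $\mathcal{B}' = (\mathcal{B}, (b_\lambda)_{\lambda<\kappa})$ of cardinality at most $\kappa$. We have $b_\lambda \ne b_\mu$ for
$\lambda < \mu < \kappa$, hence $\mathcal{B}$ is a model of $\Sigma$ of cardinality $\kappa$.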

The next proposition is Vaught’s Test for arbitrary languages and cardinalities.


Proposition 4.1.6. Suppose $L$ has size at most $\kappa$, $\Sigma$ has a model and all models
of $\Sigma$ are infinite. Suppose also that any two models of $\Sigma$ of cardinality $\kappa$ are
isomorphic. Then $\Sigma$ is complete.


Proof. Let $\sigma$ be an $L$-sentence and suppose that $\Sigma \nvdash \sigma$ and $\Sigma \nvdash \neg\sigma$. We
will derive a contradiction. First $\Sigma \nvdash \sigma$ means that $\Sigma \cup \{\neg\sigma\}$ has a model.
Similarly $\Sigma \nvdash \neg\sigma$ means that $\Sigma \cup \{\sigma\}$ has a model. These models must be
infinite since they are models of $\Sigma$, so by the Generalized Löwenheim-Skolem
Theorem $\Sigma \cup \{\neg\sigma\}$ has a model $\mathcal{A}$ of cardinality $\kappa$, and $\Sigma \cup \{\sigma\}$ has a model
$\mathcal{B}$ of cardinality $\kappa$. By assumption $\mathcal{A} \cong \mathcal{B}$, contradicting that $\mathcal{A} \models \neg\sigma$ and
$\mathcal{B} \models \sigma$.


We now discuss in detail an application of this generalized Vaught Test. Fix
a field $F$. A vector space over $F$ is an abelian (additively written) group $V$
equipped with a scalar multiplication operation

$$(\lambda, v) \mapsto \lambda v : F \times V \to V$$

such that for all $\lambda, \mu \in F$ and all $v, w \in V$,

(i) $(\lambda + \mu)v = \lambda v + \mu v$,

(ii) $\lambda(v + w) = \lambda v + \lambda w$,

(iii) $1v = v$,

(iv) $(\lambda\mu)v = \lambda(\mu v)$.

Let $L_F$ be the language of vector spaces over $F$: it extends the language $L_{\mathrm{Ab}} =
\{0, -, +\}$ of abelian groups with unary function symbols $f_\lambda$, one for each $\lambda \in F$;
a vector space $V$ over $F$ is viewed as an $L_F$-structure by interpreting each
$f_\lambda$ as the function $v \mapsto \lambda v : V \to V$. One easily specifies a set $\Sigma_F$ of
sentences whose models are exactly the vector spaces over $F$. Note that $\Sigma_F$ is
not complete since the trivial vector space satisfies $\forall x\,(x = 0)$ but $F$ viewed as
vector space over $F$ does not. Moreover, if $F$ is finite, then we have also non-trivial
finite vector spaces. From a model-theoretic perspective finite structures
are somewhat exceptional, so we are going to restrict attention to infinite vector
spaces over $F$. Let $x_1, x_2, \dots$ be a sequence of distinct variables and put

$$\Sigma_F^\infty := \Sigma_F \cup \Bigl\{\exists x_1 \dots \exists x_n \bigwedge_{1 \le i < j \le n} x_i \ne x_j : n = 2, 3, \dots\Bigr\}.$$

So the models of $\Sigma_F^\infty$ are exactly the infinite vector spaces over $F$. Note that if
$F$ itself is infinite then each non-trivial vector space over $F$ is infinite.

We will need the following facts about vector spaces $V$ and $W$ over $F$.
(Proofs can be found in many places.)


Fact.

(a) $V$ has a basis $B$, that is, $B \subseteq V$, and for each vector $v \in V$ there is a
unique family $(\lambda_b)_{b\in B}$ of scalars (elements of $F$) such that $\{b \in B : \lambda_b \ne 0\}$
is finite and $v = \sum_{b\in B} \lambda_b b$.

(b) Any two bases $B$ and $C$ of $V$ have the same cardinality.

(c) If $V$ has basis $B$ and $W$ has basis $C$, then any bijection $B \to C$ extends
uniquely to an isomorphism $V \to W$.

(d) Let $B$ be a basis of $V$. Then $|V| = |B| \cdot |F|$ if $F$ or $B$ is infinite. If $F$ and
$B$ are finite, then $|V| = |F|^{|B|}$.


Theorem 4.1.7. $\Sigma_F^\infty$ is complete.


Proof. Take a $\kappa > |F|$. In particular, $L_F$ has size at most $\kappa$. Let $V$ be a vector
space over $F$ of cardinality $\kappa$. Then a basis of $V$ must also have size $\kappa$ by
property (d) above. Hence any two vector spaces over $F$ of cardinality $\kappa$ have
bases of cardinality $\kappa$ and thus are isomorphic by property (c). It follows by the
Generalized Vaught Test that $\Sigma_F^\infty$ is complete.


Remark. Theorem 4.1.7 and Exercise 3 imply for instance that if $F = \mathbb{R}$ then
all non-trivial vector spaces over $F$ satisfy exactly the same sentences in $L_F$.


With the generalized Vaught Test we can also prove that ACF(0) (whose models
are the algebraically closed fields of characteristic 0) is complete. The proof is
similar, with “transcendence bases” taking over the role of bases. The relevant
definitions and facts are as follows.


Let $K$ be a field with subfield $k$. A subset $B$ of $K$ is said to be algebraically
independent over $k$ if for all distinct $b_1,\dots,b_n \in B$ we have $p(b_1,\dots,b_n) \ne 0$
for all nonzero polynomials $p(x_1,\dots,x_n) \in k[x_1,\dots,x_n]$, where $x_1,\dots,x_n$ are
distinct variables. A transcendence basis of $K$ over $k$ is a set $B \subseteq K$ such that
$B$ is algebraically independent over $k$ and $K$ is algebraic over its subfield $k(B)$.


Fact.

(a) $K$ has a transcendence basis over $k$;

(b) any two transcendence bases of $K$ over $k$ have the same size;

(c) if $K$ is algebraically closed with transcendence basis $B$ over $k$ and $K'$ is
also an algebraically closed field extension of $k$ with transcendence basis $B'$
over $k$, then any bijection $B \to B'$ extends to an isomorphism $K \to K'$;

(d) if $K$ is uncountable and $|K| > |k|$, then $|K| = |B|$ for each transcendence
basis $B$ of $K$ over $k$.


Applying this with $k = \mathbb{Q}$ and $k = \mathbb{F}_p$ for prime numbers $p$, we obtain that
any two algebraically closed fields of the same characteristic and the same
uncountable size are isomorphic. Using Vaught’s Test for models of size $\aleph_1$ this
yields:


Theorem 4.1.8. The set ACF(0) of axioms for algebraically closed fields of
characteristic zero is complete. Likewise, ACF($p$) is complete for each prime
number $p$.


If the hypothesis of Vaught’s Test or its generalization is satisfied, then many
things follow of which completeness is only one; it goes beyond the scope of these
notes to develop this large chapter of pure model theory, which goes under the
name of “categoricity in power”, but we cannot resist mentioning two remarkable
theorems in this area. First an assumption and a definition.

Assume $L$ is countable, $\Sigma$ has a model, and all models of $\Sigma$ are infinite.
Given any infinite cardinal $\kappa$, we say that $\Sigma$ is $\kappa$-categorical if all models of $\Sigma$
of cardinality $\kappa$ are isomorphic.

Mentioning cardinals here may give the wrong impression about the role
of set theory in model theory. The key results concerning these categoricity
notions actually show their intrinsic and robust logical nature rather than any
sensitive dependence on infinite cardinals:


Theorem 4.1.9. For $L$ and $\Sigma$ as above, the following are equivalent:

(i) $\Sigma$ is $\aleph_0$-categorical;

(ii) $\Sigma$ is complete, and for any $n \ge 1$ and distinct variables $x_1,\dots,x_n$ there
are up to $\Sigma$-equivalence only finitely many $L$-formulas $\varphi(x_1,\dots,x_n)$.


This result dates from the 1950’s. The next theorem is due to Morley (1965), 
and is considered the first theorem in pure model theory of real depth. 


Theorem 4.1.10. With $L$ and $\Sigma$ as above, if $\Sigma$ is $\kappa$-categorical for some
uncountable $\kappa$, then $\Sigma$ is $\kappa$-categorical for every uncountable $\kappa$.


Exercises.

(1) Let $L = \{U\}$ where $U$ is a unary relation symbol. Consider the $L$-structure
$(\mathbb{Z}; \mathbb{N})$. Give an informative description of a complete set of $L$-sentences true
in $(\mathbb{Z}; \mathbb{N})$. (A description like $\{\sigma : (\mathbb{Z}; \mathbb{N}) \models \sigma\}$ is correct but not informative.
An explicit, possibly infinite, list of axioms is required. Hint: Make an educated
guess and try to verify it using Vaught’s Test or one of its variants.)


(2) Let $\Sigma_1$ and $\Sigma_2$ be sets of $L$-sentences such that no symbol of $L$ occurs in both $\Sigma_1$
and $\Sigma_2$. Suppose $\Sigma_1$ and $\Sigma_2$ have infinite models. Then $\Sigma_1 \cup \Sigma_2$ has a model.


(3) Let $L = \{S\}$ where $S$ is a unary function symbol. Consider the $L$-structure $(\mathbb{Z}; S)$
where $S(a) = a + 1$ for $a \in \mathbb{Z}$. Give an informative description of a complete set
of $L$-sentences true in $(\mathbb{Z}; S)$.


4.2 Elementary Equivalence and Back-and-Forth 


In the rest of this chapter we relax notation, and just write $\varphi(a_1,\dots,a_n)$ for an
$L_A$-sentence $\varphi(\underline{a_1},\dots,\underline{a_n})$, where $\mathcal{A} = (A; \dots)$ is an $L$-structure, $\varphi(x_1,\dots,x_n)$
an $L_A$-formula, and $(a_1,\dots,a_n) \in A^n$.


In this section $\mathcal{A}$ and $\mathcal{B}$ denote $L$-structures. We say that $\mathcal{A}$ and $\mathcal{B}$ are
elementarily equivalent (notation: $\mathcal{A} \equiv \mathcal{B}$) if they satisfy the same $L$-sentences. Thus
by the previous section $(\mathbb{Q}; <) \equiv (\mathbb{R}; <)$, and any two infinite vector spaces
over a given field $F$ are elementarily equivalent.


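A partial isomorphism from $\mathcal{A}$ to $\mathcal{B}$ is a bijection $\gamma : X \to Y$ with $X \subseteq A$ and
$Y \subseteq B$ (so $X = \mathrm{domain}(\gamma)$ and $Y = \mathrm{codomain}(\gamma)$) such that

(i) for all $m$-ary $R \in L^{\mathrm{r}}$ and $a_1,\dots,a_m \in X$,

$$R^{\mathcal{A}}(a_1,\dots,a_m) \iff R^{\mathcal{B}}(\gamma a_1,\dots,\gamma a_m);$$

(ii) for all $n$-ary $F \in L^{\mathrm{f}}$ and $a_1,\dots,a_n,a_{n+1} \in X$,

$$F^{\mathcal{A}}(a_1,\dots,a_n) = a_{n+1} \iff F^{\mathcal{B}}(\gamma a_1,\dots,\gamma a_n) = \gamma(a_{n+1}).$$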


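Examples. An isomorphism $\mathcal{A} \to \mathcal{B}$ is the same as a partial isomorphism from
$\mathcal{A}$ to $\mathcal{B}$ with domain $A$ and codomain $B$. If $\gamma : X \to Y$ is a partial isomorphism
from $\mathcal{A}$ to $\mathcal{B}$, then $\gamma^{-1} : Y \to X$ is a partial isomorphism from $\mathcal{B}$ to $\mathcal{A}$, and
for any $E \subseteq X$ the restriction $\gamma|_E : E \to \gamma(E)$ is a partial isomorphism from
$\mathcal{A}$ to $\mathcal{B}$. Suppose $\mathcal{A} = (A; <)$ and $\mathcal{B} = (B; <)$ are totally ordered sets, and
$N \in \mathbb{N}$ and $a_1,\dots,a_N \in A$, $b_1,\dots,b_N \in B$ are such that $a_1 < a_2 < \dots < a_N$
and $b_1 < b_2 < \dots < b_N$; then the map $a_i \mapsto b_i : \{a_1,\dots,a_N\} \to \{b_1,\dots,b_N\}$ is
a partial isomorphism from $\mathcal{A}$ to $\mathcal{B}$.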


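A back-and-forth system from $\mathcal{A}$ to $\mathcal{B}$ is a collection $\Gamma$ of partial isomorphisms
from $\mathcal{A}$ to $\mathcal{B}$ such that

(i) (“Forth”) for all $\gamma \in \Gamma$ and $a \in A$ there is a $\gamma' \in \Gamma$ such that $\gamma'$ extends
$\gamma$ and $a \in \mathrm{domain}(\gamma')$;

(ii) (“Back”) for all $\gamma \in \Gamma$ and $b \in B$ there is a $\gamma' \in \Gamma$ such that $\gamma'$ extends
$\gamma$ and $b \in \mathrm{codomain}(\gamma')$.


If $\Gamma$ is a back-and-forth system from $\mathcal{A}$ to $\mathcal{B}$, then $\Gamma^{-1} := \{\gamma^{-1} : \gamma \in \Gamma\}$ is a
back-and-forth system from $\mathcal{B}$ to $\mathcal{A}$. We call $\mathcal{A}$ and $\mathcal{B}$ back-and-forth equivalent
(notation: $\mathcal{A} \equiv_{\mathrm{bf}} \mathcal{B}$) if there exists a nonempty back-and-forth system from $\mathcal{A}$
to $\mathcal{B}$. Hence $\mathcal{A} \equiv_{\mathrm{bf}} \mathcal{A}$, and if $\mathcal{A} \equiv_{\mathrm{bf}} \mathcal{B}$, then $\mathcal{B} \equiv_{\mathrm{bf}} \mathcal{A}$.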

Hausdorff’s proof of Cantor’s theorem in Section 4.1 generalizes as follows: 


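Proposition 4.2.1. Suppose $\mathcal{A}$ and $\mathcal{B}$ are countable and $\mathcal{A} \equiv_{\mathrm{bf}} \mathcal{B}$. Then $\mathcal{A} \cong \mathcal{B}$.


Proof. Let $\Gamma$ be a nonempty back-and-forth system from $\mathcal{A}$ to $\mathcal{B}$. We proceed as
in the proof of Cantor’s theorem, and construct a sequence $(\gamma_n)$ in $\Gamma$ such that
each $\gamma_{n+1}$ extends $\gamma_n$, $A = \bigcup_n \mathrm{domain}(\gamma_n)$ and $B = \bigcup_n \mathrm{codomain}(\gamma_n)$. Then
the map $A \to B$ that extends each $\gamma_n$ is an isomorphism $\mathcal{A} \to \mathcal{B}$.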


In applying this proposition and the next one in a concrete situation, the key is 
to guess a back-and-forth system. That is where insight and imagination (and 
experience) come in. The next result has no countability assumption. 


Proposition 4.2.2. If $\mathcal{A} \equiv_{\mathrm{bf}} \mathcal{B}$, then $\mathcal{A} \equiv \mathcal{B}$.


Proof. Suppose $\Gamma$ is a nonempty back-and-forth system from $\mathcal{A}$ to $\mathcal{B}$. We claim
that for any $L$-formula $\varphi(y_1,\dots,y_n)$ and all $\gamma \in \Gamma$ and $a_1,\dots,a_n \in \mathrm{domain}(\gamma)$,

$$\mathcal{A} \models \varphi(a_1,\dots,a_n) \iff \mathcal{B} \models \varphi(\gamma a_1,\dots,\gamma a_n).$$

(For $n = 0$ this gives $\mathcal{A} \equiv \mathcal{B}$, but the claim is much stronger.) Exercise 4
of Section 3.3 shows that it is enough to prove this claim for unnested $\varphi$.
We proceed by induction on the number of logical symbols in unnested
formulas $\varphi(y_1,\dots,y_n)$. The case of unnested atomic formulas follows directly
from the definition of partial isomorphism. The connectives $\neg, \vee, \wedge$ present
no problem. For $\exists x\,\psi(x, y_1,\dots,y_n)$, use the back-and-forth property. As to
$\forall x\,\psi(x, y_1,\dots,y_n)$, use that it is equivalent to $\neg\exists x\,\neg\psi(x, y_1,\dots,y_n)$.


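Exercises.

(1) Define a finite restriction of a bijection $\gamma : X \to Y$ to be a map $\gamma|_E : E \to \gamma(E)$
with finite $E \subseteq X$. If $\Gamma$ is a back-and-forth system from $\mathcal{A}$ to $\mathcal{B}$, so is the set of
finite restrictions of members of $\Gamma$.


(2) If $\mathcal{A} \equiv_{\mathrm{bf}} \mathcal{B}$ and $\mathcal{B} \equiv_{\mathrm{bf}} \mathcal{C}$, then $\mathcal{A} \equiv_{\mathrm{bf}} \mathcal{C}$.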


4.3 Quantifier Elimination


First an example from high school algebra. The ordered field of real numbers is
the structure $(\mathbb{R}; <, 0, 1, +, -, \cdot)$. In this structure the formula

$$\varphi(a, b, c) := \exists y\,(ay^2 + by + c = 0)$$

is equivalent to the q-free formula

$$(a \ne 0 \wedge b^2 \ge 4ac) \vee (a = 0 \wedge b \ne 0) \vee (a = 0 \wedge b = 0 \wedge c = 0).$$

(Here the coefficients $a, b, c$ are free variables and the “unknown” $y$ is
existentially quantified.) This equivalence gives an effective test for the existence of a
$y$ with a certain property, which avoids in particular having to check infinitely
many values of $y$ (even uncountably many in the case above). This illustrates
the kind of property quantifier elimination is.
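
The effective test can be written down directly. The sketch below (function names are assumptions of this illustration) implements the quantifier-free condition and, for comparison, an independent decision procedure that looks at the vertex of the parabola instead of the inequality on the coefficients.

```python
from fractions import Fraction

def qfree(a, b, c):
    """The quantifier-free condition on the coefficients."""
    return ((a != 0 and b * b >= 4 * a * c)
            or (a == 0 and b != 0)
            or (a == 0 and b == 0 and c == 0))

def has_real_root(a, b, c):
    """Decide 'exists real y with a*y^2 + b*y + c = 0' via the vertex."""
    if a == 0:
        # Linear case b*y + c = 0, or the constant equation c = 0.
        return b != 0 or c == 0
    # For a != 0 the extreme value of the parabola is at y = -b/(2a).
    vertex_value = c - Fraction(b * b, 4 * a)
    # A root exists iff the extreme value is zero or has sign opposite
    # to the direction in which the parabola opens.
    return vertex_value == 0 or (vertex_value < 0) != (a < 0)

agree = all(qfree(a, b, c) == has_real_root(a, b, c)
            for a in range(-3, 4) for b in range(-3, 4) for c in range(-3, 4))
```

Neither function searches through values of $y$: both decide the existential statement from the coefficients alone.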

Another example: in every field, the formula

$$\forall x \forall y\,\bigl((ax + by = 0 \wedge cx + dy = 0) \to (x = 0 \wedge y = 0)\bigr)$$

is equivalent to the q-free formula $ad \ne bc$. Roughly speaking, the role of
determinants, discriminants, resultants, and the like is to eliminate a (quantified)
variable.
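
Over a finite field this equivalence can be checked exhaustively. The sketch below reads the quantified formula as the statement that the homogeneous system $ax + by = 0$, $cx + dy = 0$ has only the trivial solution, and works in the prime field with $p$ elements represented by $0,\dots,p-1$; the function names are assumptions of this illustration.

```python
from itertools import product

def only_trivial_solution(a, b, c, d, p):
    """Brute force: every solution (x, y) in F_p of the system is (0, 0)."""
    return all(
        (x, y) == (0, 0)
        for x, y in product(range(p), repeat=2)
        if (a * x + b * y) % p == 0 and (c * x + d * y) % p == 0
    )

def determinant_nonzero(a, b, c, d, p):
    """The quantifier-free condition ad != bc in F_p."""
    return (a * d - b * c) % p != 0

def equivalence_holds(p):
    """Compare both conditions for all coefficients in F_p."""
    return all(
        only_trivial_solution(a, b, c, d, p) == determinant_nonzero(a, b, c, d, p)
        for a, b, c, d in product(range(p), repeat=4)
    )
```

The left-hand function quantifies over all of the field, while the right-hand one only evaluates a polynomial in the coefficients.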


The role of the general coefficients $a, b, c, d$ in these examples is taken over in
this section by a tuple $x = (x_1,\dots,x_n)$ of distinct variables.


Definition. $\Sigma$ has quantifier elimination (QE) if every $L$-formula $\varphi(x)$ is
$\Sigma$-equivalent to a quantifier free (short: q-free) $L$-formula $\varphi^{\mathrm{qf}}(x)$.


By taking $n = 0$ in this definition we see that if $\Sigma$ has QE, then every $L$-sentence
is $\Sigma$-equivalent to a q-free $L$-sentence.


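Lemma 4.3.1. Suppose $\Sigma$ has QE and $\mathcal{B}$ and $\mathcal{C}$ are models of $\Sigma$ with a common
substructure $\mathcal{A}$ (we do not assume $\mathcal{A} \models \Sigma$). Then $\mathcal{B}$ and $\mathcal{C}$ satisfy the same
$L_A$-sentences.

Proof. Let $\sigma$ be an $L_A$-sentence. We have to show $\mathcal{B} \models \sigma \iff \mathcal{C} \models \sigma$. Write $\sigma$
as $\varphi(a)$ with $\varphi(x)$ an $L$-formula and $a \in A^n$. Take a q-free $L$-formula $\varphi^{\mathrm{qf}}(x)$
that is $\Sigma$-equivalent to $\varphi(x)$. Then $\mathcal{B} \models \sigma$ iff $\mathcal{B} \models \varphi^{\mathrm{qf}}(a)$ iff $\mathcal{A} \models \varphi^{\mathrm{qf}}(a)$ (by
Exercise 8 of Section 2.5) iff $\mathcal{C} \models \varphi^{\mathrm{qf}}(a)$ (by the same exercise) iff $\mathcal{C} \models \sigma$.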


Corollary 4.3.2. Suppose $\Sigma$ has a model, has QE, and there exists an
$L$-structure that can be embedded into every model of $\Sigma$. Then $\Sigma$ is complete.


Proof. Take an $L$-structure $\mathcal{A}$ that can be embedded into every model of $\Sigma$. Let
$\mathcal{B}$ and $\mathcal{C}$ be any two models of $\Sigma$. So $\mathcal{A}$ is isomorphic to a substructure of $\mathcal{B}$ and
of $\mathcal{C}$. Then by a slight rewording of the proof of Lemma 4.3.1 (considering only
$L$-sentences), we see that $\mathcal{B}$ and $\mathcal{C}$ satisfy the same $L$-sentences. It follows that
$\Sigma$ is complete.


Remark. We have seen that Vaught’s test can be used to prove completeness. 
The above corollary gives another way of establishing completeness, and is often 
applicable when the hypothesis of Vaught’s Test is not satisfied. Completeness 
is only one of the nice consequences of QE, and the easiest one to explain at this 
stage. The main impact of QE is rather that it gives access to the structural 
properties of definable sets. This will be reflected in exercises at the end of 
this section. Applications of model theory to other areas of mathematics often 
involve QE as a key step. 


A basic conjunction in $L$ is by definition a conjunction of atomic and negated
atomic $L$-formulas. Each q-free $L$-formula $\varphi(x)$ is equivalent to a disjunction
$\varphi_1(x) \vee \dots \vee \varphi_k(x)$ of basic conjunctions $\varphi_i(x)$ in $L$ (“disjunctive normal form”).
In what follows $y$ is a single variable distinct from the variables $x_1,\dots,x_n$ in a
tuple $x = (x_1,\dots,x_n)$.


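Lemma 4.3.3. Suppose that for every basic conjunction $\theta(x, y)$ in $L$ there is a
q-free $L$-formula $\theta^{\mathrm{qf}}(x)$ such that

$$\Sigma \vdash \exists y\,\theta(x, y) \leftrightarrow \theta^{\mathrm{qf}}(x).$$

Then $\Sigma$ has QE.


Proof. Let us say that an $L$-formula $\varphi(x)$ has $\Sigma$-QE if it is $\Sigma$-equivalent to a
q-free $L$-formula $\varphi^{\mathrm{qf}}(x)$. Note that if the $L$-formulas $\varphi_1(x)$ and $\varphi_2(x)$ have $\Sigma$-QE,
then $\neg\varphi_1(x)$, $(\varphi_1 \vee \varphi_2)(x)$, and $(\varphi_1 \wedge \varphi_2)(x)$ have $\Sigma$-QE.

Next, let $\varphi(x) = \exists y\,\psi(x, y)$, and suppose inductively that the $L$-formula
$\psi(x, y)$ has $\Sigma$-QE. Hence $\psi(x, y)$ is $\Sigma$-equivalent to a disjunction $\bigvee_i \psi_i(x, y)$ of
basic conjunctions $\psi_i(x, y)$ in $L$, with $i$ ranging over some finite index set. In
view of the equivalence of $\exists y \bigvee_i \psi_i(x, y)$ with $\bigvee_i \exists y\,\psi_i(x, y)$ we obtain

$$\Sigma \vdash \varphi(x) \leftrightarrow \bigvee_i \exists y\,\psi_i(x, y).$$

Each $\exists y\,\psi_i(x, y)$ has $\Sigma$-QE, by hypothesis, so $\varphi(x)$ has $\Sigma$-QE.
Finally, let $\varphi(x) = \forall y\,\psi(x, y)$, and suppose inductively that the $L$-formula
$\psi(x, y)$ has $\Sigma$-QE. This case reduces to the previous case since $\varphi(x)$ is equivalent
to $\neg\exists y\,\neg\psi(x, y)$.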

In the following theorem, let $\Sigma$ be the set of axioms for dense totally ordered
sets without endpoints (in the language $L_O$).


Theorem 4.3.4. $\Sigma$ has QE.


Proof. Let $(x, y) = (x_1,\dots,x_n,y)$ be a tuple of $n + 1$ distinct variables, and
consider a basic conjunction $\varphi(x, y)$ in $L_O$. By Lemma 4.3.3 it suffices to show
that $\exists y\,\varphi(x, y)$ is $\Sigma$-equivalent to a q-free formula $\psi(x)$. We may assume that
each conjunct of $\varphi$ is of one of the following types:

$$y = x_i, \qquad x_i < y, \qquad y < x_i \qquad (1 \le i \le n).$$


To justify this, observe that if we had instead a conjunct $y \ne x_i$ then we could
replace it by $(y < x_i) \vee (x_i < y)$ and use the fact that $\exists y\,(\varphi_1(x, y) \vee \varphi_2(x, y))$
is equivalent to $\exists y\,\varphi_1(x, y) \vee \exists y\,\varphi_2(x, y)$. Similarly, a negation $\neg(y < x_i)$ can
be replaced by the disjunction $y = x_i \vee x_i < y$, and likewise with negations
$\neg(x_i < y)$. Also conjuncts in which $y$ does not appear can be eliminated because

$$\vdash \exists y\,(\psi(x) \wedge \theta(x, y)) \leftrightarrow \psi(x) \wedge \exists y\,\theta(x, y).$$

Suppose that we have a conjunct $y = x_i$ in $\varphi(x, y)$, so $\varphi(x, y)$ is equivalent to
$y = x_i \wedge \varphi'(x, y)$, where $\varphi'(x, y)$ is a basic conjunction in $L_O$. Then $\exists y\,\varphi(x, y)$
is equivalent to $\varphi'(x, x_i)$, and we are done. So we can assume also that $\varphi(x, y)$
has no conjuncts of the form $y = x_i$.

After all these reductions, and after rearranging conjuncts we can assume 
that y(z,y) is a conjunction 


\ai<ur \yK<; 


i€l jeJ 


kK 


where J, J C {1,...,n} and where we allow I or J to be empty. Up till this 
point we did not need the density and “no endpoints” axioms, but these come 
in now: Jyy(a,y) is U-equivalent to the formula 


\ Ti < X;. 


1E1,9E7 
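
The last step of this proof can be tried out on the model $(\mathbb{Q}; <)$. The following is a minimal sketch, not part of the original notes: the function names are hypothetical, variables are indexed from 0 rather than 1, and a q-free conjunction is represented as a list of index pairs $(i,j)$ read as $x_i < x_j$.

```python
from fractions import Fraction
from itertools import product


def eliminate_exists(lower, upper):
    """q-free equivalent of: exists y (x_i < y for i in lower, y < x_j for j in upper).

    Returns a list of pairs (i, j), read as the conjunction of all x_i < x_j.
    The empty list is the empty conjunction (always true).
    """
    return [(i, j) for i in lower for j in upper]


def holds(pairs, a):
    """Evaluate the q-free conjunction at the tuple a."""
    return all(a[i] < a[j] for i, j in pairs)


def witness(lower, upper, a):
    """Return a rational y with a[i] < y < a[j] for all i, j, or None if there is none."""
    lo = max((a[i] for i in lower), default=None)
    hi = min((a[j] for j in upper), default=None)
    if lo is None and hi is None:
        return Fraction(0)
    if lo is None:
        return hi - 1      # uses "no least element"
    if hi is None:
        return lo + 1      # uses "no greatest element"
    return (lo + hi) / 2 if lo < hi else None   # uses density
```

The comments mark where each axiom is used: the two "no endpoints" axioms handle an empty $I$ or $J$, and density handles the case where both are nonempty.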


We mention without proof two important examples of QE, and give a complete 
proof for a third example in the next section. The following theorem is due to 
Tarski and (independently) to Chevalley. It dates from around 1950. 


Theorem 4.3.5. ACF has QE. 


Clearly, ACF is not complete, since it says nothing about the characteristic: it doesn't prove $1+1 = 0$, nor does it prove $1+1 \ne 0$. However, ACF(0), which contains additional axioms forcing the characteristic to be 0, is complete by 4.3.2 and the fact that the ring of integers embeds in every algebraically closed field of characteristic 0. Tarski also established the following more difficult theorem, which is one of the key results in real algebraic geometry. (His original proof is rather long; there is a shorter one due to A. Seidenberg, and an elegant short proof by A. Robinson using a combination of basic algebra and elementary model theory.)


Definition. RCF is a set of axioms true in the ordered field $(\mathbb{R}; <, 0, 1, -, +, \cdot)$ of real numbers. In addition to the ordered field axioms, it has the axiom $\forall x\,(x > 0 \to \exists y\,(x = y^2))$ ($x, y$ distinct variables) and for each odd $n > 1$ the axiom

$$\forall x_1 \dots x_n\, \exists y\,(y^n + x_1 y^{n-1} + \dots + x_n = 0)$$

where $x_1,\dots,x_n,y$ are distinct variables. The models of RCF are known as real closed ordered fields.


Theorem 4.3.6. RCF has QE and is complete. 


Exercises. In (5) and (6), an $L$-theory is a set $T$ of $L$-sentences such that for all $L$-sentences $\sigma$, if $T \vdash \sigma$, then $\sigma \in T$. An axiomatization of an $L$-theory $T$ is a set $\Sigma$ of $L$-sentences such that $T = \{\sigma : \sigma \text{ is an } L\text{-sentence and } \Sigma \vdash \sigma\}$.

(1) The subsets of $\mathbb{C}$ definable in $(\mathbb{C}; 0, 1, -, +, \cdot)$ are exactly the finite subsets of $\mathbb{C}$ and their complements in $\mathbb{C}$. (Hint: use the fact that ACF has QE.)

(2) The subsets of $\mathbb{R}$ definable in the ordered field $(\mathbb{R}; <, 0, 1, -, +, \cdot)$ of real numbers are exactly the finite unions of intervals of all kinds (including degenerate intervals with just one point). (Hint: use the fact that RCF has QE.)

(3) Find a set $\mathrm{Eq}_\infty$ of sentences in the language $L = \{\sim\}$, where $\sim$ is a binary relation symbol, whose models are the $L$-structures $\mathcal{A} = (A; \sim)$ such that:
(i) $\sim$ is an equivalence relation on $A$;
(ii) every equivalence class is infinite;
(iii) there are infinitely many equivalence classes.
Show that $\mathrm{Eq}_\infty$ admits QE and is complete. (It is also possible to use Vaught's test to prove completeness.)


(4) Suppose that a set $\Sigma$ of $L$-sentences has QE. Let the language $L'$ extend $L$ by new symbols of arity 0, and let $\Sigma' \supseteq \Sigma$ be a set of $L'$-sentences. Then $\Sigma'$ (as a set of $L'$-sentences) also has QE.

(5) Suppose the $L$-theory $T$ has QE. Then $T$ has an axiomatization consisting of sentences $\forall x \exists y\,\varphi(x,y)$ and $\forall x\,\psi(x)$ where $\varphi(x,y)$ and $\psi(x)$ are q-free. (Hint: let $\Sigma$ be the set of $L$-sentences provable from $T$ that have the indicated form; show that $\Sigma$ has QE, and is an axiomatization of $T$.)

(6) Assume the $L$-theory $T$ has built-in Skolem functions, that is, for each basic conjunction $\varphi(x,y)$ there are $L$-terms $t_1(x),\dots,t_k(x)$ such that

$$T \vdash \exists y\,\varphi(x,y) \to \varphi(x,t_1(x)) \vee \dots \vee \varphi(x,t_k(x)).$$

Then $T$ has QE, for every $\varphi(x,y)$ there are $L$-terms $t_1(x),\dots,t_k(x)$ such that $T \vdash \exists y\,\varphi(x,y) \leftrightarrow \varphi(x,t_1(x)) \vee \dots \vee \varphi(x,t_k(x))$, and $T$ has an axiomatization consisting of sentences $\forall x\,\psi(x)$ where $\psi(x)$ is q-free.




4.4  Presburger Arithmetic 


In this section we consider in some detail one example of a set of axioms that has QE, namely "Presburger Arithmetic." Essentially, this is a complete set of axioms for ordinary arithmetic of integers without multiplication, that is, the axioms are true in $(\mathbb{Z}; 0, 1, +, -, <)$, and prove every sentence true in this structure. There is a mild complication in trying to obtain this completeness via QE: one can show (exercise) that for any q-free formula $\varphi(x)$ in the language $\{0, 1, +, -, <\}$ there is an $N \in \mathbb{N}$ such that either $(\mathbb{Z}; 0, 1, +, -, <) \models \varphi(n)$ for all $n > N$ or $(\mathbb{Z}; 0, 1, +, -, <) \models \neg\varphi(n)$ for all $n > N$. In particular, formulas such as $\exists y(x = y + y)$ and $\exists y(x = y + y + y)$ are not $\Sigma$-equivalent to any q-free formula in this language, for any set $\Sigma$ of axioms true in $(\mathbb{Z}; 0, 1, +, -, <)$.

To overcome this obstacle to QE we augment the language $\{0, 1, +, -, <\}$ by new unary relation symbols $P_1, P_2, P_3, P_4, \dots$ to obtain the language $L_{\mathrm{PrA}}$ of Presburger Arithmetic (named after the Polish logician Presburger who was a student of Tarski). We expand $(\mathbb{Z}; 0, 1, +, -, <)$ to the $L_{\mathrm{PrA}}$-structure

$$\mathbf{Z} = (\mathbb{Z}; 0, 1, +, -, <, \mathbb{Z}, 2\mathbb{Z}, 3\mathbb{Z}, 4\mathbb{Z}, \dots),$$

that is, $P_n$ is interpreted as the set $n\mathbb{Z}$. This structure satisfies the set PrA of Presburger Axioms which consists of the following sentences:


(i) the axioms of Ab for abelian groups;

(ii) the axioms in Section 4.1 expressing that $<$ is a total order;

(iii) $\forall x \forall y \forall z\,(x < y \to x + z < y + z)$ (translation invariance of $<$);

(iv) $0 < 1 \wedge \neg\exists y\,(0 < y < 1)$ (discreteness axiom);

(v) $\forall x \exists y \bigvee_{0 \le r < n} x = ny + r1$, $n = 1, 2, 3, \dots$ (division with remainder);

(vi) $\forall x\,(P_n x \leftrightarrow \exists y\,(x = ny))$, $n = 1, 2, 3, \dots$ (defining axioms for $P_1, P_2, \dots$).


Here we have fixed distinct variables x,y,z for definiteness. In (v) and in the 
rest of this section r ranges over integers. Note that (v) and (vi) are infinite 
lists of axioms. Here are some elementary facts about models of PrA: 


Proposition 4.4.1. Let $\mathcal{A} = (A; 0, 1, +, -, <, P_1^{\mathcal{A}}, P_2^{\mathcal{A}}, P_3^{\mathcal{A}}, \dots) \models \mathrm{PrA}$. Then

(1) There is a unique embedding $\mathbf{Z} \to \mathcal{A}$; it sends $k \in \mathbb{Z}$ to $k1 \in A$.

(2) Given any $n > 0$ we have $P_n^{\mathcal{A}} = nA$, where we regard $A$ as an abelian group, and $A/nA$ has exactly $n$ elements, namely $0 + nA, \dots, (n-1)1 + nA$.

(3) For any $n > 0$ and $a \in A$, exactly one of $a, a+1, \dots, a+(n-1)1$ lies in $nA$.

(4) $A$ is torsion-free as an abelian group.


Theorem 4.4.2. PrA has QE. 


Proof. Let $(x,y) = (x_1,\dots,x_n,y)$ be a tuple of $n+1$ distinct variables, and consider a basic conjunction $\varphi(x,y)$ in $L_{\mathrm{PrA}}$. By Lemma 4.3.3 it suffices to show that $\exists y\,\varphi(x,y)$ is PrA-equivalent to a q-free formula $\psi(x)$. We may assume that each conjunct of $\varphi$ is of one of the following types, where $m, N$ are natural numbers $\ge 1$ and $t(x)$ is an $L_{\mathrm{PrA}}$-term:

$$my = t(x), \qquad my < t(x), \qquad t(x) < my, \qquad P_N(my + t(x)).$$

To justify this assumption observe that if we had instead a conjunct $my \ne t(x)$ then we could replace it by $(my < t(x)) \vee (t(x) < my)$ and use the fact that $\exists y(\varphi_1(x,y) \vee \varphi_2(x,y))$ is equivalent to $\exists y\,\varphi_1(x,y) \vee \exists y\,\varphi_2(x,y)$. Similarly, a negation $\neg P_N(my + t(x))$ can be replaced by the disjunction

$$P_N(my + t(x) + 1) \vee \dots \vee P_N(my + t(x) + (N-1)1).$$

Also conjuncts in which $y$ does not appear can be eliminated because

$$\vdash \exists y(\psi(x) \wedge \theta(x,y)) \leftrightarrow \psi(x) \wedge \exists y\,\theta(x,y).$$

Since $\mathrm{PrA} \vdash P_N(z) \leftrightarrow P_{rN}(rz)$ for $r > 0$ we can replace $P_N(my + t(x))$ by $P_{rN}(rmy + rt(x))$. Also, for $r > 1$ we can replace $my = t(x)$ by $rmy = rt(x)$, and likewise with $my < t(x)$ and $t(x) < my$. We can therefore assume that all conjuncts have the same "coefficient" $m$ in front of the variable $y$. After all these reductions, and after rearranging conjuncts, $\varphi(x,y)$ has the form

$$\bigwedge_{h \in H} my = t_h(x) \;\wedge\; \bigwedge_{i \in I} t_i(x) < my \;\wedge\; \bigwedge_{j \in J} my < t_j(x) \;\wedge\; \bigwedge_{k \in K} P_{N(k)}(my + t_k(x))$$

where $m \ge 1$, $H, I, J, K$ are disjoint finite index sets, and each $N(k)$ is a natural number $\ge 1$. We allow some of these index sets to be empty, in which case the corresponding conjunction can be left out.

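Suppose that $H \ne \emptyset$, say $h' \in H$. Then the formula $\exists y\,\varphi(x,y)$ is PrA-equivalent to

$$P_m(t_{h'}(x)) \wedge \bigwedge_{h \in H} t_h(x) = t_{h'}(x) \wedge \bigwedge_{i \in I} t_i(x) < t_{h'}(x) \wedge \bigwedge_{j \in J} t_{h'}(x) < t_j(x) \wedge \bigwedge_{k \in K} P_{N(k)}(t_{h'}(x) + t_k(x)).$$

For the rest of the proof we assume that $H = \emptyset$.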

To understand what follows, it may help to focus on the model $\mathbf{Z}$, although the arguments go through for arbitrary models of PrA. Fix any value $a \in \mathbb{Z}^n$ of $x$. Consider the system of linear congruences (with "unknown" $y$)

$$P_{N(k)}(my + t_k(a)), \qquad (k \in K),$$

which in more familiar notation would be written as

$$my + t_k(a) \equiv 0 \mod N(k), \qquad (k \in K).$$

The solutions in $\mathbb{Z}$ of this system form a union of congruence classes modulo $N := \prod_{k \in K} N(k)$, where as usual we put $N = 1$ for $K = \emptyset$. This suggests replacing $y$ successively by $Nz, 1 + Nz, \dots, (N-1)1 + Nz$. Our precise claim is that $\exists y\,\varphi(x,y)$ is PrA-equivalent to the formula $\theta(x)$ given by

$$\bigvee_{r=0}^{N-1} \Big( \bigwedge_{k \in K} P_{N(k)}\big((mr)1 + t_k(x)\big) \wedge \exists z \Big( \bigwedge_{i \in I} t_i(x) < m(r1 + Nz) \wedge \bigwedge_{j \in J} m(r1 + Nz) < t_j(x) \Big) \Big).$$
We prove this equivalence with $\theta(x)$ as follows. Suppose

$$\mathcal{A} = (A; \dots) \models \mathrm{PrA}, \qquad a = (a_1,\dots,a_n) \in A^n.$$

We have to show that $\mathcal{A} \models \exists y\,\varphi(a,y)$ if and only if $\mathcal{A} \models \theta(a)$. So let $b \in A$ be such that $\mathcal{A} \models \varphi(a,b)$. Division with remainder yields a $c \in A$ and an $r$ such that $b = r1 + Nc$ and $0 \le r \le N-1$. Note that then for $k \in K$,

$$mb + t_k(a) = m(r1 + Nc) + t_k(a) = (mr)1 + (mN)c + t_k(a) \in N(k)A$$

and so $\mathcal{A} \models P_{N(k)}((mr)1 + t_k(a))$. Also,

$$t_i(a) < m(r1 + Nc) \text{ for every } i \in I, \qquad m(r1 + Nc) < t_j(a) \text{ for every } j \in J.$$

Therefore $\mathcal{A} \models \theta(a)$ with $\exists z$ witnessed by $c$. For the converse, suppose that the disjunct of $\theta(a)$ indexed by a certain $r \in \{0,\dots,N-1\}$ is true in $\mathcal{A}$, with $\exists z$ witnessed by $c \in A$. Then put $b = r1 + Nc$ and we get $\mathcal{A} \models \varphi(a,b)$. This proves the claimed equivalence.

Now that we have proved the claim we have reduced to the situation (after changing notation) where $H = K = \emptyset$ (no equations and no congruences). So $\varphi(x,y)$ now has the form

$$\bigwedge_{i \in I} t_i(x) < my \;\wedge\; \bigwedge_{j \in J} my < t_j(x).$$

If $I = \emptyset$ or $J = \emptyset$ then $\mathrm{PrA} \vdash \exists y\,\varphi(x,y) \leftrightarrow \top$. This leaves the case where both $I$ and $J$ are nonempty. So suppose $\mathcal{A} \models \mathrm{PrA}$ and that $A$ is the underlying set of $\mathcal{A}$. For each value $a \in A^n$ of $x$ there is $i_0 \in I$ such that $t_{i_0}(a)$ is maximal among the $t_i(a)$ with $i \in I$, and a $j_0 \in J$ such that $t_{j_0}(a)$ is minimal among the $t_j(a)$ with $j \in J$. Moreover each interval of $m$ successive elements of $A$ contains an element of $mA$. Therefore $\exists y\,\varphi(x,y)$ is equivalent in $\mathcal{A}$ to the disjunction over all pairs $(i_0, j_0) \in I \times J$ of the q-free formula

$$\bigwedge_{i \in I} t_i(x) \le t_{i_0}(x) \;\wedge\; \bigwedge_{j \in J} t_{j_0}(x) \le t_j(x) \;\wedge\; \bigvee_{r=1}^{m} \Big( P_m\big(t_{i_0}(x) + r1\big) \wedge \big(t_{i_0}(x) + r1 < t_{j_0}(x)\big) \Big).$$

This completes the proof. Note that $L_{\mathrm{PrA}}$ does not contain the relation symbol $\le$; we just write $t \le t'$ to abbreviate $(t < t') \vee (t = t')$.
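
The final step of the proof can be checked by brute force in the standard model. The sketch below is an illustration only, with hypothetical function names: it takes the values of the lower-bound terms $t_i(a)$ and upper-bound terms $t_j(a)$ as integer lists, and computing the maximum and minimum plays the role of selecting the pair $(i_0, j_0)$ in the disjunction.

```python
from itertools import product


def exists_y(lowers, uppers, m):
    """Brute force in Z: is there an integer y with l < m*y < u for all bounds l, u?"""
    lo, hi = max(lowers), min(uppers)
    # Any solution y lies between lo/m and hi/m, so this window suffices.
    return any(lo < m * y < hi for y in range(lo // m - 1, hi // m + 2))


def qfree(lowers, uppers, m):
    """Quantifier-free test: some r in 1..m has m | (lo + r) and lo + r < hi."""
    lo, hi = max(lowers), min(uppers)
    return any((lo + r) % m == 0 and lo + r < hi for r in range(1, m + 1))
```

For example, with $m = 2$ the bounds $0 < 2y < 3$ are satisfiable ($y = 1$) while $0 < 2y < 2$ are not, and `qfree` agrees in both cases.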

Remark. It now follows from Corollary 4.3.2 that PrA is complete: it has QE and $\mathbf{Z}$ can be embedded in every model.


Discussion. The careful reader will have noticed that the elimination procedure in the proof above is constructive: it describes an algorithm that, given any basic conjunction $\varphi(x,y)$ in $L_{\mathrm{PrA}}$ as input, constructs a q-free formula $\psi(x)$ of $L_{\mathrm{PrA}}$ such that $\mathrm{PrA} \vdash \exists y\,\varphi(x,y) \leftrightarrow \psi(x)$. In view of the equally constructive proof of Lemma 4.3.3 this yields an algorithm that, given any $L_{\mathrm{PrA}}$-formula $\varphi(x)$ as input, constructs a q-free $L_{\mathrm{PrA}}$-formula $\varphi^{\mathrm{qf}}(x)$ such that $\mathrm{PrA} \vdash \varphi(x) \leftrightarrow \varphi^{\mathrm{qf}}(x)$. (Thus PrA has effective QE.)

In particular, this last algorithm constructs for any $L_{\mathrm{PrA}}$-sentence $\sigma$ a q-free $L_{\mathrm{PrA}}$-sentence $\sigma^{\mathrm{qf}}$ such that $\mathrm{PrA} \vdash \sigma \leftrightarrow \sigma^{\mathrm{qf}}$. Since we also have an obvious algorithm that, given any q-free $L_{\mathrm{PrA}}$-sentence $\sigma$, checks whether $\sigma$ is true in $\mathbf{Z}$, this yields an algorithm that, given any $L_{\mathrm{PrA}}$-sentence $\sigma$, checks whether $\sigma$ is true in $\mathbf{Z}$. Thus the structure $\mathbf{Z}$ is decidable. (A precise definition of decidability will be given in the next Chapter.) The algorithms above can easily be implemented by computer programs.

Let some $L$-structure $\mathcal{A}$ be given, and suppose we have an algorithm for deciding whether any given $L$-sentence is true in $\mathcal{A}$. Even if we manage to write a computer program that implements this algorithm, there is no guarantee that the program is of practical use, or feasible: on some moderately small inputs it might have to run for $10^{100}$ years before producing an output. This bad behaviour is not at all unusual: no (classical, sequential) algorithm for deciding the truth of $L_{\mathrm{PrA}}$-sentences in $\mathbf{Z}$ is feasible in a precise technical sense. Results of this kind belong to complexity theory; this is an area where mathematics (logic, number theory, ...) and computer science interact.

There do exist feasible integer linear programming algorithms that decide 
the truth in Z of sentences of a special form, and this shows another (very 
practical) side of complexity theory. 

A positive impact of QE is that it yields structural properties of definable 
sets, as in Exercises (1) and (2) of Section 4.3, and as we discuss next for Z. 


Definition. Let $d$ be a positive integer. An arithmetic progression of modulus $d$ is a set of the form

$$\{k \in \mathbb{Z} : k \equiv r \mod d,\ \alpha < k < \beta\},$$

where $r \in \{0,\dots,d-1\}$, $\alpha, \beta \in \mathbb{Z} \cup \{-\infty, +\infty\}$, $\alpha < \beta$.

We leave the proof of the next lemma to the reader.


Lemma 4.4.3. Arithmetic progressions have the following properties.

(1) If $P, Q \subseteq \mathbb{Z}$ are arithmetic progressions of moduli $d$ and $e$ respectively, then $P \cap Q$ is an arithmetic progression of modulus $\mathrm{lcm}(d,e)$.

(2) If $P \subseteq \mathbb{Z}$ is an arithmetic progression, then $\mathbb{Z} \setminus P$ is a finite union of arithmetic progressions.

(3) Let $\mathcal{P}$ be the collection of all finite unions of arithmetic progressions. Then $\mathcal{P}$ contains with any two sets $X, Y$ also $X \cup Y$, $X \cap Y$, $X \setminus Y$.

Corollary 4.4.4. Let $S \subseteq \mathbb{Z}$. Then

$$S \text{ is definable in } \mathbf{Z} \iff S \text{ is a finite union of arithmetic progressions.}$$


Proof. ($\Leftarrow$) It suffices to show that each arithmetic progression is definable in $\mathbf{Z}$; this is straightforward and left to the reader. ($\Rightarrow$) By QE and Lemma 4.4.3 it suffices to show that each atomic $L_{\mathrm{PrA}}$-formula $\varphi(x)$ defines in $\mathbf{Z}$ a finite union of arithmetic progressions. Every atomic formula $\varphi(x)$ different from $\top$ and $\bot$ has the form $t_1(x) < t_2(x)$ or the form $t_1(x) = t_2(x)$ or the form $P_d(t(x))$, where $t_1(x)$, $t_2(x)$ and $t(x)$ are $L_{\mathrm{PrA}}$-terms. The first two kinds reduce to $t(x) > 0$ and $t(x) = 0$ respectively (by subtraction). It follows that we may assume that $\varphi(x)$ has the form $kx + l1 > 0$, or the form $kx + l1 = 0$, or the form $P_d(kx + l1)$, where $k, l \in \mathbb{Z}$. Considering cases ($k = 0$, $k \ne 0$ and $k \equiv 0 \mod d$, and so on), we see that such a $\varphi(x)$ defines an arithmetic progression.


Exercises. 
(1) The set $2\mathbb{Z}$ cannot be defined in the structure $(\mathbb{Z}; 0, 1, +, -, <)$ by a q-free formula of the language $\{0, 1, +, -, <\}$.


4.5 Skolemization and Extension by Definition 


In this section $L$ is a sublanguage of $L'$, $\Sigma$ is a set of $L$-sentences, and $\Sigma'$ is a set of $L'$-sentences with $\Sigma \subseteq \Sigma'$.


Definition. $\Sigma'$ is said to be conservative over $\Sigma$ (or a conservative extension of $\Sigma$) if for every $L$-sentence $\sigma$,

$$\Sigma' \vdash_{L'} \sigma \iff \Sigma \vdash_L \sigma.$$

Here ($\Longrightarrow$) is the significant direction, since ($\Longleftarrow$) is automatic. Note:

(1) If $\Sigma'$ is conservative over $\Sigma$, then: $\Sigma$ is consistent $\iff$ $\Sigma'$ is consistent.

(2) If each model of $\Sigma$ has an $L'$-expansion to a model of $\Sigma'$, then $\Sigma'$ is conservative over $\Sigma$. (This follows easily from the Completeness Theorem.)


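Proposition 4.5.1. Let $\varphi(x,y)$ be an $L$-formula, $x = (x_1,\dots,x_m)$. Let $f_\varphi$ be an $m$-ary function symbol not in $L$, and put $L' := L \cup \{f_\varphi\}$ and

$$\Sigma' := \Sigma \cup \{\forall x\,(\exists y\,\varphi(x,y) \to \varphi(x, f_\varphi(x)))\}$$

where $\forall x := \forall x_1 \dots \forall x_m$. Then $\Sigma'$ is conservative over $\Sigma$.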


Proof. Let $\mathcal{A}$ be any model of $\Sigma$. By (2) above it suffices to obtain an $L'$-expansion $\mathcal{A}'$ of $\mathcal{A}$ that makes the new axiom about $f_\varphi$ true. Choose a function $f : A^m \to A$ as follows. For any $a \in A^m$, if there is a $b \in A$ such that $\mathcal{A} \models \varphi(a,b)$ then we let $f(a)$ be such an element $b$, and if no such $b$ exists, we let $f(a)$ be an arbitrary element of $A$. Interpreting $f_\varphi$ as the function $f$ gives an $L'$-expansion $\mathcal{A}'$ of $\mathcal{A}$ with

$$\mathcal{A}' \models \exists y\,\varphi(x,y) \to \varphi(x, f_\varphi(x))$$

as desired.




Remark. A function $f$ as in the proof is called a Skolem function in $\mathcal{A}$ for the formula $\varphi(x,y)$. It yields a "witness" for each relevant $m$-tuple.
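
For a finite structure the choice made in the proof can be carried out by exhaustive search. The following sketch is only an illustration under assumptions of our own: the name `skolem_function` is hypothetical, the universe is a Python list, and the formula $\varphi$ is given as a Python predicate `phi(a, b)` on a tuple `a` and an element `b`.

```python
from itertools import product


def skolem_function(universe, phi, arity):
    """Build f : A^arity -> A with phi(a, f(a)) whenever some witness b exists."""
    default = universe[0]          # the "arbitrary element" used when no witness exists
    table = {}
    for a in product(universe, repeat=arity):
        witnesses = [b for b in universe if phi(a, b)]
        table[a] = witnesses[0] if witnesses else default
    return lambda *a: table[a]


# Example: A = Z/7 and phi(x, y) says y*y = x (mod 7).
A = list(range(7))
phi = lambda a, b: (b * b) % 7 == a[0]
f = skolem_function(A, phi, 1)
```

Here `f(2)` is a square root of 2 modulo 7, while `f(3)` is just the default element because 3 is not a square modulo 7.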


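Definition. Given an $L$-formula $\varphi(x)$ with $x = (x_1,\dots,x_m)$, let $R_\varphi$ be an $m$-ary relation symbol not in $L$, and put $L_\varphi := L \cup \{R_\varphi\}$ and

$$\Sigma_\varphi := \Sigma \cup \{\forall x\,(\varphi(x) \leftrightarrow R_\varphi(x))\}.$$

The sentence $\forall x\,(\varphi(x) \leftrightarrow R_\varphi(x))$ is called the defining axiom for $R_\varphi$. We call $\Sigma_\varphi$ an extension of $\Sigma$ by a definition for the relation symbol $R_\varphi$.

Remark. Each model $\mathcal{A}$ of $\Sigma$ has a unique $L_\varphi$-expansion $\mathcal{A}_\varphi \models \Sigma_\varphi$. Every model of $\Sigma_\varphi$ is of the form $\mathcal{A}_\varphi$ for a unique model $\mathcal{A}$ of $\Sigma$.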


Proposition 4.5.2. Let $\varphi = \varphi(x)$ be as above. Then:

(1) $\Sigma_\varphi$ is conservative over $\Sigma$.

(2) For each $L_\varphi$-formula $\psi(y)$ where $y = (y_1,\dots,y_n)$ there is an $L$-formula $\psi^*(y)$, called a translation of $\psi(y)$, such that $\Sigma_\varphi \vdash \psi(y) \leftrightarrow \psi^*(y)$.

(3) Suppose $\mathcal{A} \models \Sigma$ and $S \subseteq A^m$. Then $S$ is 0-definable in $\mathcal{A}$ if and only if $S$ is 0-definable in $\mathcal{A}_\varphi$, and the same with definable instead of 0-definable.

Proof. (1) is clear from the remark preceding the proposition, and (3) is immediate from (2). To prove (2) we observe that by the Equivalence Theorem (3.3.2) it suffices to prove it for formulas $\psi(y) = R_\varphi t_1(y) \dots t_m(y)$ where the $t_i$ are $L$-terms. In this case we can take

$$\exists u_1 \dots \exists u_m\,(u_1 = t_1(y) \wedge \dots \wedge u_m = t_m(y) \wedge \varphi(u_1/x_1,\dots,u_m/x_m))$$

as $\psi^*(y)$, where the variables $u_1,\dots,u_m$ do not appear in $\varphi$ and are not among $y_1,\dots,y_n$.

Definition. Suppose $\varphi(x,y)$ is an $L$-formula where $(x,y) = (x_1,\dots,x_m,y)$ is a tuple of $m+1$ distinct variables, such that $\Sigma \vdash \forall x\,\exists! y\,\varphi(x,y)$, where $\exists! y\,\varphi(x,y)$ abbreviates $\exists y\,(\varphi(x,y) \wedge \forall z\,(\varphi(x,z) \to y = z))$, with $z$ a variable not occurring in $\varphi$ and not among $x_1,\dots,x_m,y$. Let $f_\varphi$ be an $m$-ary function symbol not in $L$ and put $L' := L \cup \{f_\varphi\}$ and

$$\Sigma' := \Sigma \cup \{\forall x\,\varphi(x, f_\varphi(x))\}.$$

The sentence $\forall x\,\varphi(x, f_\varphi(x))$ is called the defining axiom for $f_\varphi$. We call $\Sigma'$ an extension of $\Sigma$ by a definition for the function symbol $f_\varphi$.

Remark. Each model $\mathcal{A}$ of $\Sigma$ has a unique $L'$-expansion $\mathcal{A}' \models \Sigma'$. Every model of $\Sigma'$ is of the form $\mathcal{A}'$ for a unique model $\mathcal{A}$ of $\Sigma$. Proposition 4.5.2 goes through when $L_\varphi$, $\Sigma_\varphi$, and $\mathcal{A}_\varphi$ are replaced by $L'$, $\Sigma'$, and $\mathcal{A}'$, respectively. We leave the proof of this as an exercise. (Hint: reduce the proof of the analogue of (2) to the case of an unnested formula.)

Definitional expansions. It is also useful to consider expansions of a structure $\mathcal{A}$ by several primitives, each 0-definable in $\mathcal{A}$. To discuss this situation, let $\mathcal{A}'$ be an $L'$-expansion of the $L$-structure $\mathcal{A}$. Then we call $\mathcal{A}'$ a definitional expansion of $\mathcal{A}$ if for each symbol $s \in L' \setminus L$ the interpretation $s^{\mathcal{A}'}$ of $s$ in $\mathcal{A}'$ is 0-definable in $\mathcal{A}$. Assume $\mathcal{A}'$ is a definitional expansion of $\mathcal{A}$. Take for each $m$-ary relation symbol $R$ of $L' \setminus L$ an $L$-formula $\varphi_R(x_1,\dots,x_m)$ that defines the set $R^{\mathcal{A}'} \subseteq A^m$ in $\mathcal{A}$, and for each $n$-ary function symbol $F$ of $L' \setminus L$ an $L$-formula $\varphi_F(x_1,\dots,x_n,y)$ that defines the graph of the map $F^{\mathcal{A}'} : A^n \to A$ in $\mathcal{A}$. For $R$ as above, call the sentence

$$\forall x_1 \dots \forall x_m\,(R x_1 \dots x_m \leftrightarrow \varphi_R(x_1,\dots,x_m))$$

the defining axiom for $R$, and for $F$ as above, call the sentence

$$\forall x_1 \dots \forall x_n \forall y\,(F x_1 \dots x_n = y \leftrightarrow \varphi_F(x_1,\dots,x_n,y))$$

the defining axiom for $F$. Let $D$ be the set of defining axioms for the symbols in $L' \setminus L$ obtained in this way. So $D$ is a set of $L'$-sentences.


Lemma 4.5.3. For each $L'$-formula $\varphi'(y)$, where $y = (y_1,\dots,y_n)$, there is an $L$-formula $\varphi(y)$ such that $D \vdash \varphi'(y) \leftrightarrow \varphi(y)$.

The proof goes by induction on formulas using the Equivalence Theorem and is left to the reader. (It might help to restrict first to unnested formulas; see Section 3.3, Exercise (4).) Assume now that $L$ and $L'$ are finite, so $D$ is finite. Then the proof gives an effective procedure that on any input $\varphi'(y)$ as in the lemma gives an output $\varphi(y)$ with the property stated in the lemma.


Defining $\mathcal{A}$ in $\mathcal{B}$. Before introducing the next concept we consider a simple case. Let $(A; <)$ be a totally ordered set, $A \ne \emptyset$. By a definition of $(A; <)$ in a structure $\mathcal{B}$ we mean an injective map $\delta : A \to B^k$, with $k \in \mathbb{N}$, such that
(i) $\delta(A) \subseteq B^k$ is definable in $\mathcal{B}$;
(ii) the set $\{(\delta(a), \delta(b)) : a < b \text{ in } A\} \subseteq (B^k)^2 = B^{2k}$ is definable in $\mathcal{B}$.
For example, we have a definition $\delta : \mathbb{Z} \to \mathbb{N}^2$ of the ordered set $(\mathbb{Z}; <)$ of integers in the additive monoid $(\mathbb{N}; 0, +)$ of natural numbers, given by $\delta(n) = (n, 0)$ and $\delta(-n) = (0, n)$ for $n \in \mathbb{N}$. (We leave it to the reader to check this.)
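
One way to carry out this check is sketched below; it is our own illustration, not the notes' solution. It assumes the defining formulas "$a = 0 \vee b = 0$" for the image and "$a + d < b + c$" for the order on pairs $(a,b)$, $(c,d)$, where $u < v$ in $\mathbb{N}$ is expressed using only $0$ and $+$ as $\exists z\,(z \ne 0 \wedge u + z = v)$. All function names are hypothetical.

```python
def delta(p):
    """The map delta : Z -> N^2 from the text."""
    return (p, 0) if p >= 0 else (0, -p)


def less_in_N(u, v):
    """u < v written in (N; 0, +): there is z != 0 with u + z = v."""
    return any(z != 0 and u + z == v for z in range(v + 1))


def in_image(pair):
    """Candidate defining formula for delta(Z): a = 0 or b = 0."""
    a, b = pair
    return a == 0 or b == 0


def order_formula(s, t):
    """Candidate defining formula for the order: a + d < b + c."""
    (a, b), (c, d) = s, t
    return less_in_N(a + d, b + c)
```

The order formula works because $(a,b)$ and $(c,d)$ stand for the integers $a - b$ and $c - d$, and $a - b < c - d$ is equivalent to $a + d < b + c$.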

It can be shown with tools a little beyond the scope of these notes that no 
infinite totally ordered set can be defined in the field of complex numbers. 


In order to extend this notion to arbitrary structures $\mathcal{A} = (A; \dots)$ we use the following notation and terminology. Let $X, Y$ be sets, $f : X \to Y$ a map, and $S \subseteq X^n$. Then the $f$-image of $S$ is the subset

$$f(S) := \{(f(x_1),\dots,f(x_n)) : (x_1,\dots,x_n) \in S\}$$

of $Y^n$. Also, given $k \in \mathbb{N}$, we use the bijection

$$((y_{11},\dots,y_{1k}),\dots,(y_{n1},\dots,y_{nk})) \mapsto (y_{11},\dots,y_{1k},\dots,y_{n1},\dots,y_{nk})$$

from $(Y^k)^n$ to $Y^{nk}$ to identify these two sets.

Definition. A definition of an $L$-structure $\mathcal{A}$ in a structure $\mathcal{B}$ is an injective map $\delta : A \to B^k$, with $k \in \mathbb{N}$, such that

(i) $\delta(A) \subseteq B^k$ is definable in $\mathcal{B}$;

(ii) for each $m$-ary $R \in L^{\mathrm{r}}$ the set $\delta(R^{\mathcal{A}}) \subseteq (B^k)^m = B^{mk}$ is definable in $\mathcal{B}$;

(iii) for each $n$-ary $F \in L^{\mathrm{f}}$ the set $\delta(\text{graph of } F^{\mathcal{A}}) \subseteq (B^k)^{n+1} = B^{(n+1)k}$ is definable in $\mathcal{B}$.

Remark. Here $\mathcal{B}$ is a structure for a language $L^*$ that may have nothing to do with the language $L$. Replacing everywhere "definable" by "0-definable", we get the notion of a 0-definition of $\mathcal{A}$ in $\mathcal{B}$.

A more general way of viewing a structure $\mathcal{A}$ as in some sense living inside a structure $\mathcal{B}$ is to allow $\delta$ to be an injective map from $A$ into $B^k/E$ for some equivalence relation $E$ on $B^k$ that is definable in $\mathcal{B}$, and imposing suitable conditions. Our special case corresponds to $E$ = equality on $B^k$. (We do not develop this idea here further: the right setting for it would be many-sorted structures, rather than our one-sorted structures.)


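Recall that by Lagrange's "four squares" theorem we have

$$\mathbb{N} = \{a^2 + b^2 + c^2 + d^2 : a, b, c, d \in \mathbb{Z}\}.$$

It follows that the inclusion map $\mathbb{N} \to \mathbb{Z}$ is a 0-definition of $(\mathbb{N}; 0, +, \cdot, <)$ in $(\mathbb{Z}; 0, 1, +, -, \cdot)$. The bijection

$$a + bi \mapsto (a, b) : \mathbb{C} \to \mathbb{R}^2 \qquad (a, b \in \mathbb{R})$$

is a 0-definition of the field $(\mathbb{C}; 0, 1, +, -, \cdot)$ of complex numbers in the field $(\mathbb{R}; 0, 1, +, -, \cdot)$ of real numbers.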
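
The formula $\exists a \exists b \exists c \exists d\,(x = a^2 + b^2 + c^2 + d^2)$ that defines $\mathbb{N}$ inside the ring of integers can be tested on a small range. The sketch below is only a sanity check on finitely many values (the function name is hypothetical); it is of course no substitute for Lagrange's theorem.

```python
from itertools import product
from math import isqrt


def is_sum_of_four_squares(x):
    """Search for a, b, c, d >= 0 with a^2 + b^2 + c^2 + d^2 = x."""
    if x < 0:
        return False               # squares are nonnegative, so no solution exists
    r = range(isqrt(x) + 1)        # each of a, b, c, d is at most sqrt(x)
    return any(a * a + b * b + c * c + d * d == x
               for a, b, c, d in product(r, repeat=4))
```

Restricting the search to nonnegative $a, b, c, d$ loses nothing, since $a^2 = (-a)^2$.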

On the other hand, there is no definition of the field of real numbers in 
the field of complex numbers: this follows from the fact, stated earlier without 
proof, that no infinite totally ordered set admits a definition in the field of 
complex numbers. (A special case says that R, considered as a subset of C, is 
not definable in the field of complex numbers; this follows easily from the fact 
that ACF admits QE, see Section 4.3, exercise (1).) Indeed, it is known that the 
only fields definable in the field of complex numbers are finite fields and fields 
isomorphic to the field of complex numbers itself. 


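Proposition 4.5.4. Let $\delta : A \to B^k$ be a 0-definition of the $L$-structure $\mathcal{A}$ in the $L^*$-structure $\mathcal{B}$. Let $x_1,\dots,x_n$ be distinct variables (viewed as ranging over $A$), and let $x_{11},\dots,x_{1k},\dots,x_{n1},\dots,x_{nk}$ be $nk$ distinct variables (viewed as ranging over $B$). Then we have a map that assigns to each $L$-formula $\varphi(x_1,\dots,x_n)$ an $L^*$-formula $\delta\varphi(x_{11},\dots,x_{1k},\dots,x_{n1},\dots,x_{nk})$ such that

$$\delta(\varphi^{\mathcal{A}}) = (\delta\varphi)^{\mathcal{B}} \subseteq B^{nk}.$$

In particular, for $n = 0$ the map above assigns to each $L$-sentence $\sigma$ an $L^*$-sentence $\delta\sigma$ such that $\mathcal{A} \models \sigma \iff \mathcal{B} \models \delta\sigma$.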

This proposition is a byproduct of what follows, and is good enough for model-theoretic purposes, but for use in the next Chapter we need to be a bit more precise. In particular, we assume below that the variables are just $v_0, v_1, v_2, \dots$. (Of course, Proposition 4.5.4 is not affected by this assumption.)


Let languages $L$ and $L^*$ be given. Given any 0-definition $\delta : A \to B^k$ of an $L$-structure $\mathcal{A} = (A; \dots)$ into an $L^*$-structure $\mathcal{B} = (B; \dots)$, we shall translate any $L$-formula about $\mathcal{A}$ into an equivalent $L^*$-formula about $\mathcal{B}$. But what is meant here by "translate" and "equivalent"? This is what we need to make explicit. For use in decidability issues in the next chapter it is important to do this translation in a way that depends only on $L$, $k$, $L^*$ and the formulas of $L^*$ that define $\delta(A) \subseteq B^k$ and the sets $\delta(R^{\mathcal{A}})$ and $\delta(\mathrm{graph}(F^{\mathcal{A}}))$ in $\mathcal{B}$, for $R \in L^{\mathrm{r}}$ and $F \in L^{\mathrm{f}}$, but not on the structures $\mathcal{A}$ and $\mathcal{B}$ or on the map $\delta$ that defines $\mathcal{A}$ in $\mathcal{B}$. It is not hard to do this, but the details are somewhat lengthy. (Fortunately, they are trivial to verify when fully written out.) We now proceed with these details.


We first define a kind of copy $L_k$ of the language $L$; it depends only on $L$ and the natural number $k$. The symbols of the language $L_k$ are the following:

(a) a relation symbol $U$ of arity $k$,
(b) for each $m$-ary $R \in L^{\mathrm{r}}$ a relation symbol $R_k$ of arity $mk$,
(c) for each $n$-ary $F \in L^{\mathrm{f}}$ a relation symbol $F_k$ of arity $(n+1)k$.

We insist that $U$ is different from $s_k$ for each symbol $s \in L$, and that different $s \in L$ give different $s_k$. For each variable $x = v_i$, let $x_1,\dots,x_k$ be the variables $v_{ik+1},\dots,v_{ik+k}$, in this order. Next, we define a map $\varphi \mapsto \varphi_k$ from the set of unnested $L$-formulas into the set of $L_k$-formulas such that if $\varphi$ has the form $\varphi(x_1,\dots,x_n)$, then $\varphi_k$ will have the form $\varphi_k(x_{11},\dots,x_{1k},\dots,x_{n1},\dots,x_{nk})$. (Thus if $x_i$ happens to be the variable $v_j$, then $x_{i1},\dots,x_{ik}$ are the variables $v_{jk+1},\dots,v_{jk+k}$, in this order, according to our convention.) The definition of this map $\varphi \mapsto \varphi_k$ is by recursion on (unnested) formulas:


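(i) if $\varphi$ is $\top$, then $\varphi_k$ is $\top$, and if $\varphi$ is $\bot$, then $\varphi_k$ is $\bot$;
(ii) if $\varphi$ is $x = y$, then $\varphi_k$ is $x_1 = y_1 \wedge \dots \wedge x_k = y_k$;
(iii) if $\varphi$ is $R x_1 \dots x_m$ with $m$-ary $R \in L^{\mathrm{r}}$, then $\varphi_k$ is $R_k x_{11} \dots x_{1k} \dots x_{m1} \dots x_{mk}$;
(iv) if $\varphi$ is $F x_1 \dots x_n = y$ with $n$-ary $F \in L^{\mathrm{f}}$, then $\varphi_k$ is $F_k x_{11} \dots x_{1k} \dots x_{n1} \dots x_{nk} y_1 \dots y_k$;
(v) if $\varphi$ is $\neg\psi$, then $\varphi_k$ is $\neg\psi_k$;
(vi) if $\varphi$ is $\psi \vee \theta$, then $\varphi_k$ is $\psi_k \vee \theta_k$;
(vii) if $\varphi$ is $\psi \wedge \theta$, then $\varphi_k$ is $\psi_k \wedge \theta_k$;
(viii) if $\varphi$ is $\exists x\,\psi$, then $\varphi_k$ is $\exists x_1 \dots \exists x_k\,(U x_1 \dots x_k \wedge \psi_k)$;
(ix) if $\varphi$ is $\forall x\,\psi$, then $\varphi_k$ is $\forall x_1 \dots \forall x_k\,(U x_1 \dots x_k \to \psi_k)$.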

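In the rest of this section we assume that $\delta : A \to B^k$ is a 0-definition of the $L$-structure $\mathcal{A} = (A; \dots)$ in the $L^*$-structure $\mathcal{B} = (B; \dots)$. We arrange that $L^*$ and $L_k$ are disjoint, and form the language $L_k^* := L^* \cup L_k$. We now expand $\mathcal{B}$ to an $L_k^*$-structure $\mathcal{B}_k$ as follows: interpret $U$ as $\delta(A)$, and $R_k$ for $m$-ary $R \in L^{\mathrm{r}}$ as $\delta(R^{\mathcal{A}})$, and $F_k$ for $n$-ary $F \in L^{\mathrm{f}}$ as $\delta(\mathrm{graph}(F^{\mathcal{A}}))$. A straightforward induction gives:

Lemma 4.5.5. For any unnested $L$-formula $\varphi(x_1,\dots,x_n)$ and $a_1,\dots,a_n \in A$,

$$\mathcal{A} \models \varphi(a_1,\dots,a_n) \iff \mathcal{B}_k \models \varphi_k(\delta(a_1),\dots,\delta(a_n)).$$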


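It is clear that $\mathcal{B}_k$ is a definitional expansion of $\mathcal{B}$. Explicitly, let $U^*(v_1,\dots,v_k)$ be an $L^*$-formula that defines $\delta(A)$ in $\mathcal{B}$; for each $m$-ary $R \in L^{\mathrm{r}}$, let $R^*(v_1,\dots,v_{mk})$ be an $L^*$-formula that defines $\delta(R^{\mathcal{A}})$ in $\mathcal{B}$; for each $n$-ary $F \in L^{\mathrm{f}}$, let $F^*(v_1,\dots,v_{(n+1)k})$ be an $L^*$-formula that defines $\delta(\mathrm{graph}(F^{\mathcal{A}}))$ in $\mathcal{B}$. Then in $\mathcal{B}_k$ the $L_k$-formula $U v_1 \dots v_k$ is equivalent to the $L^*$-formula $U^*(v_1,\dots,v_k)$, for $m$-ary $R \in L^{\mathrm{r}}$ the $L_k$-formula $R_k v_1 \dots v_{mk}$ is equivalent to the $L^*$-formula $R^*(v_1,\dots,v_{mk})$, and for $n$-ary $F \in L^{\mathrm{f}}$ the $L_k$-formula $F_k v_1 \dots v_{(n+1)k}$ is equivalent to the $L^*$-formula $F^*(v_1,\dots,v_{(n+1)k})$. So the defining axiom for $U$ is

$$\forall v_1 \dots \forall v_k\,(U v_1 \dots v_k \leftrightarrow U^*(v_1,\dots,v_k)),$$

for $m$-ary $R \in L^{\mathrm{r}}$ the defining axiom for $R_k$ is

$$\forall v_1 \dots \forall v_{mk}\,(R_k v_1 \dots v_{mk} \leftrightarrow R^*(v_1,\dots,v_{mk})),$$

and for $n$-ary $F \in L^{\mathrm{f}}$ the defining axiom for $F_k$ is

$$\forall v_1 \dots \forall v_{(n+1)k}\,(F_k v_1 \dots v_{(n+1)k} \leftrightarrow F^*(v_1,\dots,v_{(n+1)k})).$$

Let $\mathrm{Def}(\delta)$ be the set of $L_k^*$-sentences whose members are the defining axioms for $U$ and the $R_k$ and $F_k$ described above. These defining axioms are true in $\mathcal{B}_k$, and define $\mathcal{B}_k$ as an expansion of $\mathcal{B}$.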


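Let $\varphi = \varphi(x_1,\dots,x_n)$ be any $L$-formula. Now $\varphi$ is equivalent to an unnested $L$-formula, so Lemma 4.5.5 gives an $L_k$-formula

$$\varphi_k = \varphi_k(x_{11},\dots,x_{1k},\dots,x_{n1},\dots,x_{nk})$$

such that for all $a_1,\dots,a_n \in A$,

$$\mathcal{A} \models \varphi(a_1,\dots,a_n) \iff \mathcal{B}_k \models \varphi_k(\delta(a_1),\dots,\delta(a_n)).$$

Treating $\varphi_k$ as an $L_k^*$-formula we then obtain from Lemma 4.5.3 an $L^*$-formula

$$\delta\varphi = (\delta\varphi)(x_{11},\dots,x_{1k},\dots,x_{n1},\dots,x_{nk})$$

such that $\mathrm{Def}(\delta) \vdash \varphi_k \leftrightarrow \delta\varphi$, and thus for all $a_1,\dots,a_n \in A$,

$$\mathcal{A} \models \varphi(a_1,\dots,a_n) \iff \mathcal{B} \models (\delta\varphi)(\delta(a_1),\dots,\delta(a_n)).$$

Moreover, the map $\varphi \mapsto \varphi_k$ depends only on $L, k$, and the map $\varphi \mapsto \delta\varphi$ depends only on $L, k, L^*, \mathrm{Def}(\delta)$ (not on $\mathcal{A}$, $\mathcal{B}$, or $\delta : A \to B^k$). Let us single out the case of sentences, and summarize what we have as follows: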


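Lemma 4.5.6. To each $L$-sentence $\sigma$ is assigned an $L_k$-sentence $\sigma_k$ such that $\mathcal{A} \models \sigma \iff \mathcal{B}_k \models \sigma_k$, and an $L^*$-sentence $\delta\sigma$ such that $\mathrm{Def}(\delta) \vdash \sigma_k \leftrightarrow \delta\sigma$. Since $\mathcal{B}_k \models \mathrm{Def}(\delta)$, this gives $\mathcal{A} \models \sigma \iff \mathcal{B} \models \delta\sigma$, for each $L$-sentence $\sigma$.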


We shall also need that $\mathcal{B}_k$ satisfies certain $L_k$-sentences that express:

(i) the fact that $U^{\mathcal{B}_k} \subseteq B^k$ is nonempty;
(ii) for each $m$-ary $R \in L^{\mathrm{r}}$ the fact that $R_k^{\mathcal{B}_k} \subseteq (U^{\mathcal{B}_k})^m$;
(iii) for each $n$-ary $F \in L^{\mathrm{f}}$ the fact that the relation $F_k^{\mathcal{B}_k} \subseteq B^{(n+1)k}$ is the graph of a function $(U^{\mathcal{B}_k})^n \to U^{\mathcal{B}_k}$.

For (i), take the sentence $\exists v_1 \dots \exists v_k\, U v_1 \dots v_k$. To express (ii), take

$$\forall v_1 \dots \forall v_{mk}\,(R_k v_1 \dots v_{mk} \to U v_1 \dots v_k \wedge \dots \wedge U v_{(m-1)k+1} \dots v_{mk}).$$

We leave it to the reader to construct the sentences expressing (iii). Note that (i), (ii), and (iii) yield a set $\Delta(L,k)$ of $L_k$-sentences that depends only on $L, k$ and not on $\mathcal{A}$, $\mathcal{B}$ or $\delta : A \to B^k$. This will play a role in the next Chapter via the following Lemma.


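Lemma 4.5.7. Let $\Sigma^*$ be a set of $L^*$-sentences. Define $\Sigma$ := set of $L$-sentences $\sigma$ such that $\Sigma^* \cup \mathrm{Def}(\delta) \cup \Delta(L,k) \vdash \sigma_k$. Then for all $L$-sentences $\sigma$,

$$\Sigma \vdash \sigma \iff \Sigma^* \cup \mathrm{Def}(\delta) \cup \Delta(L,k) \vdash \sigma_k.$$

Proof. The direction $\Longleftarrow$ holds by the definition of $\Sigma$. For the converse, let $\sigma$ be an $L$-sentence such that $\Sigma^* \cup \mathrm{Def}(\delta) \cup \Delta(L,k) \nvdash \sigma_k$; it is enough to show that $\Sigma \nvdash \sigma$. The Completeness Theorem provides an $L_k^*$-structure $\mathcal{D} = (D; \dots)$ with

$$\mathcal{D} \models \Sigma^* \cup \mathrm{Def}(\delta) \cup \Delta(L,k) \cup \{\neg\sigma_k\}.$$

Then we define an $L$-structure $\mathcal{C}$ as follows: the underlying set $C$ of $\mathcal{C}$ is given by $C = U^{\mathcal{D}} \subseteq D^k$, the interpretation in $\mathcal{C}$ of an $m$-ary $R \in L^{\mathrm{r}}$ is the set $R_k^{\mathcal{D}}$ viewed as an $m$-ary relation on $C$, and the interpretation in $\mathcal{C}$ of an $n$-ary $F \in L^{\mathrm{f}}$ is the function $C^n \to C$ whose graph is $F_k^{\mathcal{D}}$ when the latter is viewed as an $(n+1)$-ary relation on $C$. By construction the inclusion map $C \hookrightarrow D^k$ is a 0-definition of $\mathcal{C}$ in the $L^*$-reduct of $\mathcal{D}$, and so for any $L$-sentence $\rho$ we have $\mathcal{C} \models \rho \iff \mathcal{D} \models \rho_k$. It follows that $\mathcal{C} \models \Sigma \cup \{\neg\sigma\}$, and so $\Sigma \nvdash \sigma$, as promised.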


Chapter 5 


Computability, Decidability, 
and Incompleteness 


In this chapter we prove Gödel's famous Incompleteness Theorem. Consider the structure $\mathfrak{N} := (\mathbb{N}; 0, S, +, \cdot, <)$, where $S : \mathbb{N} \to \mathbb{N}$ is the successor function. A simple form of the incompleteness theorem is as follows.

Let $\Sigma$ be a computable set of sentences in the language of $\mathfrak{N}$ and true in $\mathfrak{N}$. Then there exists a sentence $\sigma$ in that language such that $\mathfrak{N} \models \sigma$, but $\Sigma \nvdash \sigma$.

In other words, no computable set of axioms in the language of $\mathfrak{N}$ and true in $\mathfrak{N}$ can be complete, hence the name Incompleteness Theorem.¹ The only unexplained terminology here is "computable." Intuitively, "$\Sigma$ is computable" means that there is an algorithm to recognize whether any given sentence in the language of $\mathfrak{N}$ belongs to $\Sigma$. (It seems reasonable to require this of an axiom system for $\mathfrak{N}$.) Thus we begin this chapter with developing the notion of computability. The interest of this notion is tied to the Church-Turing Thesis as explained in Section 5.2, and goes far beyond incompleteness. For example, computability plays a role in combinatorial group theory (Higman's Theorem) and in certain diophantine questions (Hilbert's 10th problem), not to mention its role in the ideological underpinnings of computer science.


5.1 Computable Functions 


First some notation. We let $\mu x(..x..)$ denote the least $x \in \mathbb{N}$ for which $..x..$ holds. Here $..x..$ is some condition on natural numbers $x$. For example $\mu x(x^2 > 7) = 3$. We will only use this notation when the meaning of $..x..$ is clear, and the set $\{x \in \mathbb{N} : ..x..\}$ is non-empty. For $a \in \mathbb{N}$ we also let $\mu x_{<a}(..x..)$ be the least $x < a$ in $\mathbb{N}$ such that $..x..$ holds if there is such an $x$, and if there is no such $x$ we put $\mu x_{<a}(..x..) := a$. For example, $\mu x_{<4}(x^2 > 3) = 2$ and $\mu x_{<2}(x > 5) = 2$.
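
Both operators are easy to mimic in code. In this minimal sketch the names `mu` and `mu_bounded` are our own, and the condition $..x..$ is passed as a Python predicate; note that `mu` does not terminate when no $x$ satisfies the condition, which matches the convention that the notation is only used when the set is non-empty.

```python
from itertools import count


def mu(cond):
    """Least x in N with cond(x); loops forever if there is no such x."""
    return next(x for x in count() if cond(x))


def mu_bounded(a, cond):
    """Least x < a with cond(x), or a itself if there is no such x."""
    return next((x for x in range(a) if cond(x)), a)


print(mu(lambda x: x * x > 7))             # the example mu x (x^2 > 7)
print(mu_bounded(4, lambda x: x * x > 3))  # the example with bound 4
print(mu_bounded(2, lambda x: x > 5))      # no x < 2 works, so the bound is returned
```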


¹ A better name would have been Incompletability Theorem.
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1 if 
Definition. For R C N”, we define xz : N" > N by xpr(a) = : ved 

0 ifa€g R. 
Think of such R as an n-ary relation on N. We call yr the characteristic 
function of R, and often write R(a,,...,@,) instead of (a1,...,dn) € R. 


Example. $\chi_{<}(m,n) = 1$ iff $m < n$, and $\chi_{<}(m,n) = 0$ iff $m \ge n$.


Definition. For $i = 1,\dots,n$ we define $I^n_i : \mathbf{N}^n \to \mathbf{N}$ by $I^n_i(a_1,\dots,a_n) = a_i$.
These functions are called coordinate functions.


Definition. The computable functions (or recursive functions) are the functions
from $\mathbf{N}^n$ to $\mathbf{N}$ (for $n = 0, 1, 2, \dots$) obtained by inductively applying the following
rules:

(R1) $+ : \mathbf{N}^2 \to \mathbf{N}$, $\cdot : \mathbf{N}^2 \to \mathbf{N}$, $\chi_{\le} : \mathbf{N}^2 \to \mathbf{N}$, and the coordinate functions $I^n_i$
(for each $n$ and $i = 1,\dots,n$) are computable.

(R2) If $G : \mathbf{N}^m \to \mathbf{N}$ is computable and $H_1,\dots,H_m : \mathbf{N}^n \to \mathbf{N}$ are computable,
then so is the function $F = G(H_1,\dots,H_m) : \mathbf{N}^n \to \mathbf{N}$ defined by

$$F(a) = G(H_1(a),\dots,H_m(a)).$$

(R3) If $G : \mathbf{N}^{n+1} \to \mathbf{N}$ is computable, and for all $a \in \mathbf{N}^n$ there exists $x \in \mathbf{N}$
such that $G(a,x) = 0$, then the function $F : \mathbf{N}^n \to \mathbf{N}$ given by

$$F(a) = \mu x(G(a,x) = 0)$$

is computable.

A relation $R \subseteq \mathbf{N}^n$ is said to be computable (or recursive) if its characteristic
function $\chi_R : \mathbf{N}^n \to \mathbf{N}$ is computable.
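The three rules can be read as a tiny combinator library. The following Python sketch is only an illustration under our own naming (`coord`, `compose`, `minimize` are hypothetical helpers, and we assume that (R1) supplies the characteristic function of $\le$). It also shows how $\chi_{\ge}$, $\chi_{=}$ and a constant function arise by composition and minimalization alone.

```python
from itertools import count

# (R1): the basic functions.
def add(a, b): return a + b
def mul(a, b): return a * b
def chi_le(a, b): return 1 if a <= b else 0

def coord(n, i):
    """Coordinate function I^n_i (i is 1-based, as in the text)."""
    return lambda *a: a[i - 1]

# (R2): composition F = G(H_1, ..., H_m).
def compose(G, *H):
    return lambda *a: G(*(h(*a) for h in H))

# (R3): minimalization F(a) = mu x (G(a, x) = 0).
# Loops forever if no such x exists, matching the side condition of (R3).
def minimize(G):
    def F(*a):
        for x in count():
            if G(*a, x) == 0:
                return x
    return F

# chi_>=(m, n) = chi_<=(I^2_2(m, n), I^2_1(m, n))
chi_ge = compose(chi_le, coord(2, 2), coord(2, 1))
# chi_=(m, n) = chi_<=(m, n) * chi_>=(m, n)
chi_eq = compose(mul, chi_le, chi_ge)
# The constant function with value 0 on N: mu x (I^2_2(a, x) = 0)
zero = minimize(coord(2, 2))
```

Every function built this way is computable by construction, since only the three rules are used.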


Example. If $F : \mathbf{N}^3 \to \mathbf{N}$ and $G : \mathbf{N}^2 \to \mathbf{N}$ are computable, then so is the
function $H : \mathbf{N}^4 \to \mathbf{N}$ defined by $H(x_1, x_2, x_3, x_4) = F(G(x_1, x_4), x_2, x_4)$. This
follows from (R2) by noting that $H(x) = F(G(I^4_1(x), I^4_4(x)), I^4_2(x), I^4_4(x))$ where
$x = (x_1, x_2, x_3, x_4)$. We shall use this device from now on in many proofs, but
only tacitly. (The reader should of course notice when we do so.)


From (R1), (R2) and (R3) we derive further rules for obtaining computable 
functions. This is mostly an exercise in programming. 


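Lemma 5.1.1. Let $H_1,\dots,H_m : \mathbf{N}^n \to \mathbf{N}$ and $R \subseteq \mathbf{N}^m$ be computable. Then
$R(H_1,\dots,H_m) \subseteq \mathbf{N}^n$ is computable, where for $a \in \mathbf{N}^n$ we put

$$R(H_1,\dots,H_m)(a) \iff R(H_1(a),\dots,H_m(a)).$$

Proof. Observe that $\chi_{R(H_1,\dots,H_m)} = \chi_R(H_1,\dots,H_m)$. Now apply (R2).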


Lemma 5.1.2. The functions $\chi_{\ge}$ and $\chi_{=}$ on $\mathbf{N}^2$ are computable.

Proof. The function $\chi_{\ge}$ is computable because

$$\chi_{\ge}(m,n) = \chi_{\le}(n,m) = \chi_{\le}(I^2_2(m,n), I^2_1(m,n)),$$

which enables us to apply (R1) and (R2). Similarly, $\chi_{=}$ is computable:

$$\chi_{=}(m,n) = \chi_{\le}(m,n) \cdot \chi_{\ge}(m,n).$$


For $k \in \mathbf{N}$ we define the constant function $c^n_k : \mathbf{N}^n \to \mathbf{N}$ by $c^n_k(a) = k$.

Lemma 5.1.3. Every constant function $c^n_k$ is computable.

Proof. By induction on $k$. For $k = 0$ we use

$$c^n_0(a) = \mu x(I^{n+1}_{n+1}(a,x) = 0).$$

For the step from $k$ to $k+1$, observe that

$$c^n_{k+1}(a) = \mu x(c^n_k(a) < x) = \mu x(\chi_{\ge}(c^n_k(a), I^{n+1}_{n+1}(a,x)) = 0)$$

for $a \in \mathbf{N}^n$.


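Let $P, Q$ be $n$-ary relations on $\mathbf{N}$. Then we can form the $n$-ary relations

$$\neg P := \mathbf{N}^n \setminus P, \quad P \vee Q := P \cup Q, \quad P \wedge Q := P \cap Q,$$
$$P \to Q := (\neg P) \vee Q, \quad P \leftrightarrow Q := (P \to Q) \wedge (Q \to P)$$

on $\mathbf{N}$.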


Lemma 5.1.4. Suppose $P, Q$ are computable. Then $\neg P$, $P \vee Q$, $P \wedge Q$, $P \to Q$
and $P \leftrightarrow Q$ are also computable.

Proof. Let $a \in \mathbf{N}^n$. Then $\neg P(a)$ iff $\chi_P(a) = 0$ iff $\chi_P(a) = c^n_0(a)$, so $\chi_{\neg P}(a) =
\chi_{=}(\chi_P(a), c^n_0(a))$. Hence $\neg P$ is computable by (R2) and Lemma 5.1.2. Next,
the relation $P \wedge Q$ is computable since $\chi_{P \wedge Q} = \chi_P \cdot \chi_Q$. By De Morgan’s Law,
$P \vee Q = \neg(\neg P \wedge \neg Q)$. Thus $P \vee Q$ is computable. The rest is clear.


Lemma 5.1.5. The binary relations $<, \le, =, >, \ge, \ne$ on $\mathbf{N}$ are computable.

Proof. The relations $\ge$, $\le$ and $=$ have already been taken care of by Lemma
5.1.2 and (R1). The remaining relations are complements of these three, so by
Lemma 5.1.4 they are also computable.


Lemma 5.1.6. (Definition by Cases) Let $R_1,\dots,R_k \subseteq \mathbf{N}^n$ be computable such
that for each $a \in \mathbf{N}^n$ exactly one of $R_1(a),\dots,R_k(a)$ holds, and suppose that
$G_1,\dots,G_k : \mathbf{N}^n \to \mathbf{N}$ are computable. Then $G : \mathbf{N}^n \to \mathbf{N}$ given by

$$G(a) = \begin{cases} G_1(a) & \text{if } R_1(a) \\ \quad\vdots & \\ G_k(a) & \text{if } R_k(a) \end{cases}$$

is computable.

Proof. This follows from $G = G_1 \cdot \chi_{R_1} + \cdots + G_k \cdot \chi_{R_k}$.


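Lemma 5.1.7. (Definition by Cases) Let $R_1,\dots,R_k \subseteq \mathbf{N}^n$ be computable such
that for each $a \in \mathbf{N}^n$ exactly one of $R_1(a),\dots,R_k(a)$ holds. Let $P_1,\dots,P_k \subseteq \mathbf{N}^n$
be computable. Then the relation $P \subseteq \mathbf{N}^n$ defined by

$$P(a) \iff \begin{cases} P_1(a) & \text{if } R_1(a) \\ \quad\vdots & \\ P_k(a) & \text{if } R_k(a) \end{cases}$$

is computable.

Proof. Use that $P = (P_1 \wedge R_1) \vee \cdots \vee (P_k \wedge R_k)$.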
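The formula in the proof of Lemma 5.1.6 can be tried out directly. The sketch below is only an illustration: `chi` and `by_cases` are hypothetical helper names, and relations are modelled as Python predicates. It defines a function by cases as a sum of products with characteristic functions.

```python
def chi(R):
    """Characteristic function of the relation R (a Python predicate)."""
    return lambda *a: 1 if R(*a) else 0

def by_cases(cases):
    """cases: list of (R_i, G_i) where exactly one R_i holds at each input.
    Returns G = G_1 * chi_{R_1} + ... + G_k * chi_{R_k}."""
    return lambda *a: sum(G(*a) * chi(R)(*a) for R, G in cases)

# Example: the absolute difference |a - b| on N.
# max(..., 0) keeps each G_i a function into N even outside its own case.
abs_diff = by_cases([
    (lambda a, b: a >= b, lambda a, b: max(a - b, 0)),
    (lambda a, b: a < b,  lambda a, b: max(b - a, 0)),
])
```

Note that every $G_i$ is evaluated at every input; the factor $\chi_{R_i}$ merely discards the values that do not belong to the active case.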


Lemma 5.1.8. Let $R \subseteq \mathbf{N}^{n+1}$ be computable such that for all $a \in \mathbf{N}^n$ there
exists $x \in \mathbf{N}$ with $(a,x) \in R$. Then the function $F : \mathbf{N}^n \to \mathbf{N}$ given by

$$F(a) = \mu x R(a,x)$$

is computable.

Proof. Note that $F(a) = \mu x(\chi_{\neg R}(a,x) = 0)$ and apply (R3).


Here is a nice consequence of 5.1.5 and 5.1.8. 


Lemma 5.1.9. Let $F : \mathbf{N}^n \to \mathbf{N}$. Then $F$ is computable if and only if its graph
(a subset of $\mathbf{N}^{n+1}$) is computable.

Proof. Let $R \subseteq \mathbf{N}^{n+1}$ be the graph of $F$. Then for all $a \in \mathbf{N}^n$ and $b \in \mathbf{N}$,

$$R(a,b) \iff F(a) = b, \qquad F(a) = \mu x R(a,x),$$

from which the lemma follows immediately.


Lemma 5.1.10. If $R \subseteq \mathbf{N}^{n+1}$ is computable, then the function $F_R : \mathbf{N}^{n+1} \to \mathbf{N}$
defined by $F_R(a,y) = \mu x_{<y} R(a,x)$ is computable.

Proof. Use that $F_R(a,y) = \mu x(R(a,x) \text{ or } x = y)$.


Some notation: below we use the bold symbol $\boldsymbol{\exists}$ as shorthand for “there exists
a natural number”; likewise, we use the bold symbol $\boldsymbol{\forall}$ to abbreviate “for all natural
numbers.” These abbreviation symbols should not be confused with the logical
symbols $\exists$ and $\forall$.


Lemma 5.1.11. Suppose $R \subseteq \mathbf{N}^{n+1}$ is computable. Let $P, Q \subseteq \mathbf{N}^{n+1}$ be the
relations defined by

$$P(a,y) \iff \boldsymbol{\exists} x_{<y}\, R(a,x)$$
$$Q(a,y) \iff \boldsymbol{\forall} x_{<y}\, R(a,x),$$

for $(a,y) = (a_1,\dots,a_n,y) \in \mathbf{N}^{n+1}$. Then $P$ and $Q$ are computable.

Proof. Using the notation and results from Lemma 5.1.10 we note that $P(a,y)$
iff $F_R(a,y) < y$. Hence $\chi_P(a,y) = \chi_{<}(F_R(a,y), y)$. For $Q$, note that $\neg Q(a,y)$
iff $\boldsymbol{\exists} x_{<y}\, \neg R(a,x)$.


The reader should derive from Lemma 5.1.11 a variant that is often used: 


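Corollary 5.1.12. Suppose $R \subseteq \mathbf{N}^{n+2}$ is computable. Let $P, Q \subseteq \mathbf{N}^{n+1}$ be the
relations defined by

$$P(a,y) \iff \boldsymbol{\exists} x_{<y}\, R(a,x,y)$$
$$Q(a,y) \iff \boldsymbol{\forall} x_{<y}\, R(a,x,y),$$

for $(a,y) = (a_1,\dots,a_n,y) \in \mathbf{N}^{n+1}$. Then $P$ and $Q$ are computable.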


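Lemma 5.1.13. The function $\dot{-} : \mathbf{N}^2 \to \mathbf{N}$ defined by

$$a \mathbin{\dot{-}} b = \begin{cases} a - b & \text{if } a \ge b, \\ 0 & \text{if } a < b \end{cases}$$

is computable.

Proof. Use that $a \mathbin{\dot{-}} b = \mu x(b + x = a \text{ or } a < b)$.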
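For illustration, the search used in this proof can be run directly. The function name `monus` below is our own label for truncated subtraction; the loop is a literal transcription of $\mu x(b + x = a \text{ or } a < b)$.

```python
from itertools import count

def monus(a, b):
    """Truncated subtraction, computed as mu x (b + x = a or a < b)."""
    for x in count():
        if b + x == a or a < b:
            return x
```

When $a < b$ the condition already holds at $x = 0$, so the search returns 0 immediately; otherwise it stops at $x = a - b$.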


The results above imply easily that many familiar functions are computable.
But is the exponential function $n \mapsto 2^n$ computable? It certainly is in the
intuitive sense: we know how to compute (in principle) its value at any given
argument. It is not that obvious from what we have proved so far that it is
computable in our precise sense. We now develop some coding tricks due to
Gödel that enable us to prove routinely that functions like $2^n$ are computable
according to our definition of “computable function”.


Definition. Define the function $\mathrm{Pair} : \mathbf{N}^2 \to \mathbf{N}$ by

$$\mathrm{Pair}(x,y) := \frac{(x+y)(x+y+1)}{2} + x.$$

We call Pair the pairing function.


Lemma 5.1.14. The function Pair is bijective and computable. 


Proof. Exercise. 


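Definition. Since Pair is a bijection we can define functions

$$\mathrm{Left}, \mathrm{Right} : \mathbf{N} \to \mathbf{N}$$

by

$$\mathrm{Pair}(x,y) = a \iff \mathrm{Left}(a) = x \text{ and } \mathrm{Right}(a) = y.$$

The reader should check that $\mathrm{Left}(a), \mathrm{Right}(a) \le a$ for $a \in \mathbf{N}$, and $\mathrm{Left}(a) < a$
if $0 < a \in \mathbf{N}$.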
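A quick numerical check of these claims is possible in Python. The sketch assumes the pairing formula $\mathrm{Pair}(x,y) = (x+y)(x+y+1)/2 + x$; the helper `left_right` is our own name and inverts Pair in closed form with an integer square root instead of the $\mu$-searches used in the text.

```python
from math import isqrt

def pair(x, y):
    """Pair(x, y) = (x + y)(x + y + 1)/2 + x."""
    return (x + y) * (x + y + 1) // 2 + x

def left_right(a):
    """Return (Left(a), Right(a)), the unique (x, y) with pair(x, y) == a."""
    s = (isqrt(8 * a + 1) - 1) // 2   # s = x + y, the index of the diagonal
    x = a - s * (s + 1) // 2          # position of a within that diagonal
    return x, s - x

# The pairs with x + y <= 10 are mapped bijectively onto 0, 1, ..., 65.
values = sorted(pair(x, s - x) for s in range(11) for x in range(s + 1))
print(values == list(range(66)))
```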


Lemma 5.1.15. The functions Left and Right are computable.

Proof. Use 5.1.9 in combination with

$$\mathrm{Left}(a) = \mu x(\boldsymbol{\exists} y_{<a+1}\, \mathrm{Pair}(x,y) = a),$$
$$\mathrm{Right}(a) = \mu y(\boldsymbol{\exists} x_{<a+1}\, \mathrm{Pair}(x,y) = a).$$


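For $a, b, c \in \mathbf{Z}$ we have (by definition): $a \equiv b \bmod c \iff a - b \in c\mathbf{Z}$.

Lemma 5.1.16. The ternary relation $a \equiv b \bmod c$ on $\mathbf{N}$ is computable.

Proof. Use that for $a, b, c \in \mathbf{N}$ we have

$$a \equiv b \bmod c \iff (\boldsymbol{\exists} x_{<a+1}\, a = x \cdot c + b \ \text{ or } \ \boldsymbol{\exists} x_{<b+1}\, b = x \cdot c + a).$$

We can now introduce Gödel’s function $\beta : \mathbf{N}^2 \to \mathbf{N}$.

Definition. For $a, i \in \mathbf{N}$ we let $\beta(a,i)$ be the remainder of $\mathrm{Left}(a)$ upon division
by $1 + (i+1)\mathrm{Right}(a)$, that is,

$$\beta(a,i) = \mu x(x \equiv \mathrm{Left}(a) \bmod 1 + (i+1)\mathrm{Right}(a)).$$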


Proposition 5.1.17. The function $\beta$ is computable, and $\beta(a,i) \le a \mathbin{\dot{-}} 1$ for all
$a, i \in \mathbf{N}$. For any $a_0,\dots,a_n \in \mathbf{N}$ there exists $a \in \mathbf{N}$ such that

$$\beta(a,0) = a_0, \dots, \beta(a,n) = a_n.$$

Proof. The computability of $\beta$ is clear from earlier results. We have

$$\beta(a,i) \le \mathrm{Left}(a) \le a \mathbin{\dot{-}} 1.$$

Let $a_0,\dots,a_n \in \mathbf{N}$. Take $N \in \mathbf{N}$ such that $a_i \le N$ for all $i \le n$ and $N$ is a
multiple of every prime number $\le n$. We claim that then

$$1 + N,\ 1 + 2N,\ \dots,\ 1 + nN,\ 1 + (n+1)N$$

are pairwise relatively prime. To see this, suppose $p$ is a prime number such
that $p \mid 1 + iN$ and $p \mid 1 + jN$ ($1 \le i < j \le n+1$); then $p$ divides their difference
$(j-i)N$, but $p$ does not divide $N$ (since $p \mid 1 + iN$), hence $p \mid j - i \le n$. But
all prime numbers $\le n$ divide $N$, and we have a contradiction.

By the Chinese Remainder Theorem there exists an $M \in \mathbf{N}$ such that

$$M \equiv a_0 \bmod 1 + N$$
$$M \equiv a_1 \bmod 1 + 2N$$
$$\vdots$$
$$M \equiv a_n \bmod 1 + (n+1)N.$$

Put $a := \mathrm{Pair}(M,N)$; then $\mathrm{Left}(a) = M$ and $\mathrm{Right}(a) = N$, and thus $\beta(a,i) =
a_i$ for $i = 0,\dots,n$ (note that $a_i \le N < 1 + (i+1)N$), as required.

Remark. Proposition 5.1.17 shows that we can use $\beta$ to encode a sequence of
numbers $a_0,\dots,a_n$ in terms of a single number $a$. We use this as follows to
show that the function $n \mapsto 2^n$ is computable.

If $a_0,\dots,a_n$ are natural numbers such that $a_0 = 1$, and $a_{i+1} = 2a_i$ for
all $i < n$, then necessarily $a_n = 2^n$. Hence by Proposition 5.1.17 we have
$\beta(a,n) = 2^n$ where

$$a := \mu x(\beta(x,0) = 1 \text{ and } \boldsymbol{\forall} i_{<n}\, \beta(x,i+1) = 2\beta(x,i)),$$

that is,

$$2^n = \beta(a,n) = \beta(\mu x(\beta(x,0) = 1 \text{ and } \boldsymbol{\forall} i_{<n}\, \beta(x,i+1) = 2\beta(x,i)), n).$$

It follows that $n \mapsto 2^n$ is computable.
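The construction in the proof of Proposition 5.1.17 is effective and can be carried out mechanically. The following Python sketch is an illustration under stated assumptions: it uses the pairing formula $\mathrm{Pair}(x,y) = (x+y)(x+y+1)/2 + x$, the helper names `unpair` and `encode` are ours, and `encode` returns some code of the sequence (the one built from the Chinese Remainder Theorem), not the least one. The three-argument `pow` with exponent `-1` computes a modular inverse and requires Python 3.8 or later.

```python
from math import factorial, isqrt

def pair(x, y):
    return (x + y) * (x + y + 1) // 2 + x

def unpair(a):
    """Return (Left(a), Right(a))."""
    s = (isqrt(8 * a + 1) - 1) // 2
    x = a - s * (s + 1) // 2
    return x, s - x

def beta(a, i):
    """Remainder of Left(a) upon division by 1 + (i + 1) * Right(a)."""
    left, right = unpair(a)
    return left % (1 + (i + 1) * right)

def encode(seq):
    """Return some a with beta(a, i) == seq[i] for all i (seq non-empty)."""
    n = len(seq) - 1
    # N is a multiple of every prime <= n and satisfies seq[i] <= N.
    N = factorial(n) * (max(seq) + 1)
    M, modulus = 0, 1
    for i, a_i in enumerate(seq):
        m = 1 + (i + 1) * N
        # Chinese Remainder step: keep M mod modulus, force M = a_i mod m.
        t = ((a_i - M) * pow(modulus, -1, m)) % m
        M += modulus * t
        modulus *= m
    return pair(M, N)

powers = [2 ** i for i in range(6)]        # 1, 2, 4, 8, 16, 32
a = encode(powers)
print([beta(a, i) for i in range(6)])
```

The code `a` is astronomically larger than the entries it encodes, which is harmless here: the point is only that a single number determines the whole sequence through $\beta$.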


The above suggests a general method, which we develop next. To each sequence
$(a_1,\dots,a_n)$ of natural numbers we assign a sequence number, denoted
$\langle a_1,\dots,a_n \rangle$, and defined to be the least natural number $a$ such that $\beta(a,0) = n$
(the length of the sequence) and $\beta(a,i) = a_i$ for $i = 1,\dots,n$. For $n = 0$ this
gives $\langle\rangle = 0$, where $\langle\rangle$ is the sequence number of the empty sequence. We define
the length function $\mathrm{lh} : \mathbf{N} \to \mathbf{N}$ by $\mathrm{lh}(a) = \beta(a,0)$, so lh is computable.
Observe that $\mathrm{lh}(\langle a_1,\dots,a_n \rangle) = n$.

Put $(a)_i := \beta(a,i+1)$. The function $(a,i) \mapsto (a)_i : \mathbf{N}^2 \to \mathbf{N}$ is computable,
and $(\langle a_1,\dots,a_n \rangle)_i = a_{i+1}$ for $i < n$. Finally, let $\mathrm{Seq} \subseteq \mathbf{N}$ denote the set of
sequence numbers. The set Seq is computable since

$$a \in \mathrm{Seq} \iff \boldsymbol{\forall} x_{<a}\,(\mathrm{lh}(x) \ne \mathrm{lh}(a) \text{ or } \boldsymbol{\exists} i_{<\mathrm{lh}(a)}\, (x)_i \ne (a)_i).$$


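Lemma 5.1.18. For any $n$, the function $(a_1,\dots,a_n) \mapsto \langle a_1,\dots,a_n \rangle : \mathbf{N}^n \to \mathbf{N}$
is computable, and $a_i < \langle a_1,\dots,a_n \rangle$ for $(a_1,\dots,a_n) \in \mathbf{N}^n$ and $i = 1,\dots,n$.

Proof. Use $\langle a_1,\dots,a_n \rangle = \mu a(\beta(a,0) = n, \beta(a,1) = a_1, \dots, \beta(a,n) = a_n)$, and
apply Lemmas 5.1.8, 5.1.4 and 5.1.17.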


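Lemma 5.1.19. We have computable binary operations $\mathrm{In} : \mathbf{N}^2 \to \mathbf{N}$ and
$* : \mathbf{N}^2 \to \mathbf{N}$ such that for all $a_1,\dots,a_m, b_1,\dots,b_n \in \mathbf{N}$,

$$\mathrm{In}(\langle a_1,\dots,a_m \rangle, i) = \langle a_1,\dots,a_i \rangle \quad \text{for } i \le m,$$
$$\langle a_1,\dots,a_m \rangle * \langle b_1,\dots,b_n \rangle = \langle a_1,\dots,a_m,b_1,\dots,b_n \rangle.$$

Proof. Such functions are obtained by defining

$$\mathrm{In}(a,i) = \mu x(\mathrm{lh}(x) = i \text{ and } \boldsymbol{\forall} j_{<i}\, (x)_j = (a)_j),$$
$$a * b = \mu x(\mathrm{lh}(x) = \mathrm{lh}(a) + \mathrm{lh}(b) \text{ and } \boldsymbol{\forall} i_{<\mathrm{lh}(a)}\, (x)_i = (a)_i \text{ and } \boldsymbol{\forall} j_{<\mathrm{lh}(b)}\, (x)_{\mathrm{lh}(a)+j} = (b)_j).$$

Definition. For $F : \mathbf{N}^{n+1} \to \mathbf{N}$, let $\bar{F} : \mathbf{N}^{n+1} \to \mathbf{N}$ be given by

$$\bar{F}(a,b) = \langle F(a,0),\dots,F(a,b-1) \rangle \qquad (a \in \mathbf{N}^n,\ b \in \mathbf{N}).$$

Note that $\bar{F}(a,0) = \langle\rangle = 0$.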


Lemma 5.1.20. Let $F : \mathbf{N}^{n+1} \to \mathbf{N}$. Then $F$ is computable if and only if $\bar{F}$ is
computable.

Proof. Suppose $F$ is computable. Then $\bar{F}$ is computable since

$$\bar{F}(a,b) = \mu x(\mathrm{lh}(x) = b \text{ and } \boldsymbol{\forall} i_{<b}\, (x)_i = F(a,i)).$$

In the other direction, suppose $\bar{F}$ is computable. Then $F$ is computable since
$F(a,b) = (\bar{F}(a,b+1))_b$.


Given $G : \mathbf{N}^{n+2} \to \mathbf{N}$ there is a unique function $F : \mathbf{N}^{n+1} \to \mathbf{N}$ such that

$$F(a,b) = G(a,b,\bar{F}(a,b)) \qquad (a \in \mathbf{N}^n,\ b \in \mathbf{N}).$$

This will be clear if we express the requirement on $F$ as follows:

$$F(a,0) = G(a,0,0), \qquad F(a,b+1) = G(a,b+1,\langle F(a,0),\dots,F(a,b) \rangle).$$

The next result is important because it allows us to introduce computable functions
by recursion on their values at smaller arguments.


Proposition 5.1.21. Let $G$ and $F$ be as above and suppose $G$ is computable.
Then $F$ is computable.

Proof. Note that

$$\bar{F}(a,b) = \mu x(\mathrm{Seq}(x) \text{ and } \mathrm{lh}(x) = b \text{ and } \boldsymbol{\forall} i_{<b}\, (x)_i = G(a,i,\mathrm{In}(x,i)))$$

for all $a \in \mathbf{N}^n$ and $b \in \mathbf{N}$. It follows that $\bar{F}$ is computable, and thus by the
previous lemma $F$ is computable.
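The recursion scheme $F(a,b) = G(a,b,\bar{F}(a,b))$ is easy to experiment with. In the Python sketch below, a tuple of earlier values stands in for the sequence number $\bar{F}(a,b)$; this substitution, and the helper name `course_of_values`, are our own simplifications and not the coding used in the proof.

```python
def course_of_values(G):
    """Return F with F(*a, b) = G(*a, b, (F(*a, 0), ..., F(*a, b - 1)))."""
    def F(*args):
        *a, b = args
        history = []
        for i in range(b + 1):
            # At this point history holds F(*a, 0), ..., F(*a, i - 1).
            history.append(G(*a, i, tuple(history)))
        return history[b]
    return F

# Fibonacci numbers: each value depends on the two previous ones.
fib = course_of_values(lambda n, h: n if n < 2 else h[n - 1] + h[n - 2])
print([fib(n) for n in range(11)])
```

Here `G` may look at any earlier value, not just the immediately preceding one, which is what distinguishes this scheme from plain primitive recursion.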


Definition. Let $A : \mathbf{N}^n \to \mathbf{N}$ and $B : \mathbf{N}^{n+2} \to \mathbf{N}$ be given. Let $a$ range over
$\mathbf{N}^n$, and define the function $F : \mathbf{N}^{n+1} \to \mathbf{N}$ by

$$F(a,0) = A(a),$$
$$F(a,b+1) = B(a,b,F(a,b)).$$

We say that $F$ is obtained from $A$ and $B$ by primitive recursion.
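As a small illustration of this scheme, the following Python sketch unfolds a primitive recursion into a loop. The combinator name `prim_rec` is ours; addition, multiplication and exponentiation are then obtained from one another in the usual way.

```python
def prim_rec(A, B):
    """Return F with F(*a, 0) = A(*a) and F(*a, b + 1) = B(*a, b, F(*a, b))."""
    def F(*args):
        *a, b = args
        value = A(*a)
        for i in range(b):
            value = B(*a, i, value)
        return value
    return F

add = prim_rec(lambda a: a, lambda a, b, prev: prev + 1)          # a + b
mul = prim_rec(lambda a: 0, lambda a, b, prev: add(prev, a))      # a * b
power = prim_rec(lambda a: 1, lambda a, b, prev: mul(prev, a))    # a ** b
print(power(2, 10))
```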


Proposition 5.1.22. Suppose $A$, $B$, and $F$ are as above, and $A$ and $B$ are
computable. Then $F$ is computable.

Proof. Define $G : \mathbf{N}^{n+2} \to \mathbf{N}$ by

$$G(a,b,c) = \begin{cases} A(a) & \text{if } c = 0, \\ B(a, b \mathbin{\dot{-}} 1, (c)_{b \dot{-} 1}) & \text{if } c > 0. \end{cases}$$

Clearly, $G$ is computable. We claim that

$$F(a,b) = G(a,b,\bar{F}(a,b)).$$

This claim yields the computability of $F$, by Proposition 5.1.21. We have

$$F(a,0) = A(a) = G(a,0,\bar{F}(a,0)), \text{ and}$$
$$F(a,b+1) = B(a,b,F(a,b)) = B(a,b,(\bar{F}(a,b+1))_b) = G(a,b+1,\bar{F}(a,b+1)).$$

The claim follows.


Proposition 5.1.21 will be applied over and over again in the later section on
Gödel numbering, but in combination with definitions by cases. As a simple
example of such an application, let $G : \mathbf{N} \to \mathbf{N}$ and $H : \mathbf{N}^2 \to \mathbf{N}$ be computable.
There is clearly a unique function $F : \mathbf{N}^2 \to \mathbf{N}$ such that for all $a, b \in \mathbf{N}$

$$F(a,b) = \begin{cases} F(a,G(b)) & \text{if } G(b) < b, \\ H(a,b) & \text{otherwise.} \end{cases}$$

In particular $F(a,0) = H(a,0)$. We claim that $F$ is computable.

According to Proposition 5.1.21 this claim will follow if we can specify a
computable function $K : \mathbf{N}^3 \to \mathbf{N}$ such that $F(a,b) = K(a,b,\bar{F}(a,b))$ for all
$a, b \in \mathbf{N}$. Such a function $K$ is given by

$$K(a,b,c) = \begin{cases} (c)_{G(b)} & \text{if } G(b) < b, \\ H(a,b) & \text{otherwise.} \end{cases}$$


Exercises. 
(1) The set of prime numbers is computable. 


(2) The Fibonacci numbers are the natural numbers $F_n$ defined recursively by $F_0 = 0$,
$F_1 = 1$, and $F_{n+2} = F_{n+1} + F_n$. The function $n \mapsto F_n : \mathbf{N} \to \mathbf{N}$ is computable.

(3) If $f_1,\dots,f_n : \mathbf{N}^m \to \mathbf{N}$ are computable and $X \subseteq \mathbf{N}^n$ is computable, then
$f^{-1}(X) \subseteq \mathbf{N}^m$ is computable, where $f := (f_1,\dots,f_n) : \mathbf{N}^m \to \mathbf{N}^n$.

(4) If $f : \mathbf{N} \to \mathbf{N}$ is computable and surjective, then there is a computable function
$g : \mathbf{N} \to \mathbf{N}$ such that $f \circ g = \mathrm{id}_{\mathbf{N}}$.

(5) If $f : \mathbf{N} \to \mathbf{N}$ is computable and strictly increasing, then $f(\mathbf{N}) \subseteq \mathbf{N}$ is computable.

(6) All computable functions and relations are definable in $\mathfrak{N}$.

(7) Let $F : \mathbf{N}^n \to \mathbf{N}$, and define

$$\langle F \rangle : \mathbf{N} \to \mathbf{N}, \qquad \langle F \rangle(a) = F((a)_0,\dots,(a)_{n-1}),$$

so $F(a_1,\dots,a_n) = \langle F \rangle(\langle a_1,\dots,a_n \rangle)$ for all $a_1,\dots,a_n \in \mathbf{N}$. Then $F$ is computable
iff $\langle F \rangle$ is computable. (Hence $n$-variable computability reduces to 1-variable
computability.)

Let $\mathcal{F}$ be a collection of functions $F : \mathbf{N}^m \to \mathbf{N}$ for various $m$. We say that $\mathcal{F}$ is
closed under composition if for all $G : \mathbf{N}^m \to \mathbf{N}$ in $\mathcal{F}$ and all $H_1,\dots,H_m : \mathbf{N}^n \to \mathbf{N}$
in $\mathcal{F}$, the function $F = G(H_1,\dots,H_m) : \mathbf{N}^n \to \mathbf{N}$ is in $\mathcal{F}$. We say that $\mathcal{F}$ is
closed under minimalization if for every $G : \mathbf{N}^{n+1} \to \mathbf{N}$ in $\mathcal{F}$ such that for all
$a \in \mathbf{N}^n$ there exists $x \in \mathbf{N}$ with $G(a,x) = 0$, the function $F : \mathbf{N}^n \to \mathbf{N}$ given by
$F(a) = \mu x(G(a,x) = 0)$ is in $\mathcal{F}$. We say that a relation $R \subseteq \mathbf{N}^n$ is in $\mathcal{F}$ if its
characteristic function $\chi_R$ is in $\mathcal{F}$.

(8) Suppose $\mathcal{F}$ contains the functions mentioned in (R1), and is closed under
composition and minimalization. All lemmas and propositions of this Section go
through with computable replaced by in $\mathcal{F}$.


5.2 The Church-Turing Thesis 


The computable functions as defined in the last section are also computable
in the informal sense that for each such function $F : \mathbf{N}^n \to \mathbf{N}$ there is an
algorithm that on any input $a \in \mathbf{N}^n$ stops after a finite number of steps and
produces an output $F(a)$. An algorithm is given by a finite list of instructions,
a computer program, say. These instructions should be deterministic (leave 
nothing to chance or choice). We deliberately neglect physical constraints of 
space and time: imagine that the program that implements the algorithm has 
unlimited access to time and space to do its work on any given input. 

Let us write “calculable” for this intuitive, informal, idealized notion of 
computable. The Church-Turing Thesis asserts 


each calculable function $F : \mathbf{N} \to \mathbf{N}$ is computable.


The corresponding assertion for functions $\mathbf{N}^n \to \mathbf{N}$ follows, because the result
of Exercise 7 in Section 5.1 is clearly also valid for “calculable” instead
of “computable.” Call a set $P \subseteq \mathbf{N}$ calculable if its characteristic function is
calculable.

While the Church-Turing Thesis is not a precise mathematical statement, it 
is an important guiding principle, and has never failed in practice: any function 
that any competent person has ever recognized as being calculable, has turned 
out to be computable, and the informal grounds for calculability have always 
translated routinely into an actual proof of computability. Here is a heuristic 
(informal) argument that might make the Thesis plausible. 

Let an algorithm be given for computing $F : \mathbf{N} \to \mathbf{N}$. We can assume
that on any input $a \in \mathbf{N}$ this algorithm consists of a finite sequence of steps,
numbered from 0 to $n$, say, where at each step $i$ it produces a natural number
$a_i$, with $a_0 = a$ as starting number. It stops after step $n$ with $a_n = F(a)$.
We assume that for each $i < n$ the number $a_{i+1}$ is calculated by some fixed
procedure from the earlier numbers $a_0,\dots,a_i$, that is, we have a calculable
function $G : \mathbf{N} \to \mathbf{N}$ such that $a_{i+1} = G(\langle a_0,\dots,a_i \rangle)$ for all $i < n$. The
algorithm should also tell us when to stop, that is, we should have a calculable
$P \subseteq \mathbf{N}$ such that $\neg P(\langle a_0,\dots,a_i \rangle)$ for $i < n$ and $P(\langle a_0,\dots,a_n \rangle)$. Since $G$ and
$P$ describe only single steps in the algorithm for $F$ it is reasonable to assume
that they at least are computable. Once this is agreed to, one can show easily
that $F$ is computable as well, see the exercise below.
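The shape of such a step-by-step algorithm can be sketched in Python. This is only an illustration of the informal picture, not a proof of anything: the driver `run` is a hypothetical name, and Python lists stand in for the sequence numbers $\langle a_0,\dots,a_i \rangle$. The example step function uses the whole history, which is why $G$ takes the full sequence as input.

```python
def run(G, P, a):
    """Iterate a_{i+1} = G([a_0, ..., a_i]) from a_0 = a until P holds;
    return the last number produced. Loops forever if P never holds."""
    seq = [a]
    while not P(seq):
        seq.append(G(seq))
    return seq[-1]

# Example: compute the a-th Fibonacci number. The run looks like
# a, F_0, F_1, ..., F_a, so each step needs the two previous entries.
def step(seq):
    if len(seq) == 1:
        return 0
    if len(seq) == 2:
        return 1
    return seq[-1] + seq[-2]

def stop(seq):
    return len(seq) == seq[0] + 2

print(run(step, stop, 10))
```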

A skeptical reader may find this argument dubious, but Turing gave in 1936 
a compelling informal analysis of what functions $F : \mathbf{N} \to \mathbf{N}$ are calculable
in principle, and this has led to general acceptance of the Thesis. In addition, 
various alternative formalizations of the informal notion of calculable function 
have been proposed, using various kinds of machines, formal systems, and so 
on. They all have turned out to be equivalent in the sense of defining the same 
class of functions on N, namely the computable functions. 

The above is only a rather narrow version of the Church-Turing Thesis, but 
it suffices for our purpose. There are various refinements and more ambitious 
versions. Also, our Church-Turing Thesis does not characterize mathematically 
the intuitive notion of algorithm, only the intuitive notion of function computable 
by an algorithm that produces for each input from N an output in N. 


Exercises. 

(1) Let $G : \mathbf{N} \to \mathbf{N}$ and $P \subseteq \mathbf{N}$ be given. Then there is for each $a \in \mathbf{N}$ at most one
finite sequence $a_0,\dots,a_n$ of natural numbers such that $a_0 = a$, for all $i < n$ we
have $a_{i+1} = G(\langle a_0,\dots,a_i \rangle)$ and $\neg P(\langle a_0,\dots,a_i \rangle)$, and $P(\langle a_0,\dots,a_n \rangle)$. Suppose
that for each $a \in \mathbf{N}$ there is such a finite sequence $a_0,\dots,a_n$, and put $F(a) := a_n$,
thus defining a function $F : \mathbf{N} \to \mathbf{N}$. If $G$ and $P$ are computable, so is $F$.


5.3 Primitive Recursive Functions 


This section is not really needed in the rest of this chapter, but it may throw light 
on some issues relating to computability. One such issue is the condition in Rule 
(R3) for generating computable functions that for all $a \in \mathbf{N}^n$ there exists $y \in \mathbf{N}$
such that $G(a,y) = 0$. This condition is not constructive: it could be satisfied
for a certain G without us ever knowing it. We shall now argue informally that 
it is impossible to generate in a fully constructive way exactly the computable 
functions. Such a constructive generation process would presumably enable 


us to enumerate effectively a sequence of algorithms $\alpha_0, \alpha_1, \alpha_2, \dots$ such that
each $\alpha_n$ computes a (computable) function $f_n : \mathbf{N} \to \mathbf{N}$, and such that every
computable function $f : \mathbf{N} \to \mathbf{N}$ occurs in the sequence $f_0, f_1, f_2, \dots$, possibly
more than once. Now consider the function $f_{\mathrm{diag}} : \mathbf{N} \to \mathbf{N}$ defined by

$$f_{\mathrm{diag}}(n) = f_n(n) + 1.$$

Then $f_{\mathrm{diag}}$ is clearly computable in the intuitive sense, but $f_{\mathrm{diag}} \ne f_n$ for all $n$,
in violation of the Church-Turing Thesis.

This way of producing a new function $f_{\mathrm{diag}}$ from a sequence $(f_n)$ is called
diagonalization.² The same basic idea applies in other cases, and is used in a
more sophisticated form in the proof of Gödel’s incompleteness theorem.
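The diagonal construction itself is a one-liner once an enumeration is available. In the Python sketch below, `family` is a purely hypothetical stand-in for an effective enumeration $n \mapsto f_n$ (here $f_n(x) = n \cdot x$), and `diagonal` is our own name for the construction.

```python
def diagonal(family):
    """Given n -> f_n, return the function n -> f_n(n) + 1."""
    return lambda n: family(n)(n) + 1

# A hypothetical enumeration, chosen only for illustration: f_n(x) = n * x.
family = lambda n: (lambda x: n * x)
f_diag = diagonal(family)

# f_diag differs from f_n at the argument n, for every n we try.
print(all(f_diag(n) != family(n)(n) for n in range(100)))
```

Whatever enumeration is plugged in, the resulting function disagrees with the $n$-th function at the argument $n$, so it cannot occur in the list.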

Here is a class of computable functions that can be generated constructively: 
The primitive recursive functions are the functions $f : \mathbf{N}^n \to \mathbf{N}$ obtained
inductively as follows:

(PR1) The nullary function $\mathbf{N}^0 \to \mathbf{N}$ with value 0, the unary successor function
$S$, and all coordinate functions $I^n_i$ are primitive recursive.

(PR2) If $G : \mathbf{N}^m \to \mathbf{N}$ is primitive recursive and $H_1,\dots,H_m : \mathbf{N}^n \to \mathbf{N}$ are
primitive recursive, then $G(H_1,\dots,H_m)$ is primitive recursive.

(PR3) If $F : \mathbf{N}^{n+1} \to \mathbf{N}$ is obtained by primitive recursion from primitive
recursive functions $G : \mathbf{N}^n \to \mathbf{N}$ and $H : \mathbf{N}^{n+2} \to \mathbf{N}$, then $F$ is primitive
recursive.


A relation $R \subseteq \mathbf{N}^n$ is said to be primitive recursive if its characteristic function
$\chi_R$ is primitive recursive. As the next two lemmas show, the computable functions
that one ordinarily meets with are primitive recursive. In the rest of this
section $x$ ranges over $\mathbf{N}^m$ with $m$ depending on the context, and $y$ over $\mathbf{N}$.


Lemma 5.3.1. The following functions and relations are primitive recursive:

(i) each constant function $c^n_m$;

(ii) the binary operations $+$, $\cdot$, and $(x,y) \mapsto x^y$ on $\mathbf{N}$;

(iii) the predecessor function $\mathrm{Pd} : \mathbf{N} \to \mathbf{N}$ given by $\mathrm{Pd}(x) = x \mathbin{\dot{-}} 1$, the unary
relation $\{x \in \mathbf{N} : x > 0\}$, the function $\dot{-} : \mathbf{N}^2 \to \mathbf{N}$;

(iv) the binary relations $\ge$, $\le$ and $=$ on $\mathbf{N}$.

Proof. The function $c^0_m$ is obtained from $c^0_0$ by applying (PR2) $m$ times with
$G = S$. Next, $c^n_m$ is obtained by applying (PR2) with $G = c^0_m$ (a nullary $G$, so that
no functions $H_i$ are needed, viewed as a function on $\mathbf{N}^n$). The functions in (ii) are
obtained by the usual primitive recursions. It is also easy to write down primitive
recursions for the functions in (iii), in the order they are listed. For (iv), note that
$\chi_{\ge}(x, y+1) = \chi_{>0}(x) \cdot \chi_{\ge}(\mathrm{Pd}(x), y)$.


Lemma 5.3.2. With the possible exceptions of Lemmas 5.1.8 and 5.1.9, all 
Lemmas and Propositions in Section 5.1 go through with computable replaced 
by primitive recursive. 


Proof. To obtain the primitive recursive version of Lemma 5.1.10, note that

$$F_R(a,0) = 0, \qquad F_R(a,y+1) = F_R(a,y) \cdot \chi_R(a,F_R(a,y)) + (y+1) \cdot \chi_{\neg R}(a,F_R(a,y)).$$


A consequence of the primitive recursive version of Lemma 5.1.10 is the following 
restricted minimalization scheme for primitive recursive functions: 


² Perhaps antidiagonalization would be a better name.

if $R \subseteq \mathbf{N}^{n+1}$ and $H : \mathbf{N}^n \to \mathbf{N}$ are primitive recursive, and for all $a \in \mathbf{N}^n$
there exists $x < H(a)$ such that $R(a,x)$, then the function $F : \mathbf{N}^n \to \mathbf{N}$ given
by $F(a) = \mu x R(a,x)$ is primitive recursive.


The primitive recursive versions of Lemmas 5.1.11-5.1.16 and Proposition 5.1.17
now follow easily. In particular, the function $\beta$ is primitive recursive. Also, the
proof of Proposition 5.1.17 yields:

There is a primitive recursive function $B : \mathbf{N} \to \mathbf{N}$ such that, whenever

$$n < N,\ a_0 < N,\ \dots,\ a_n < N \qquad (n, a_0, \dots, a_n, N \in \mathbf{N})$$

then for some $a < B(N)$ we have $\beta(a,i) = a_i$ for $i = 0,\dots,n$.


Using this fact and restricted minimalization, it follows that the unary relation
Seq, the unary function lh, and the binary functions $(a,i) \mapsto (a)_i$, In and $*$ are
primitive recursive.

Let a function $F : \mathbf{N}^{n+1} \to \mathbf{N}$ be given. Then $\bar{F} : \mathbf{N}^{n+1} \to \mathbf{N}$ satisfies the
primitive recursion $\bar{F}(a,0) = 0$ and $\bar{F}(a,b+1) = \bar{F}(a,b) * \langle F(a,b) \rangle$. It follows
that if $F$ is primitive recursive, so is $\bar{F}$. The converse is obvious. Suppose also
that $G : \mathbf{N}^{n+2} \to \mathbf{N}$ is primitive recursive, and $F(a,b) = G(a,b,\bar{F}(a,b))$ for all
$(a,b) \in \mathbf{N}^{n+1}$; then $\bar{F}$ satisfies the primitive recursion

$$\bar{F}(a,0) = 0, \qquad \bar{F}(a,b+1) = \bar{F}(a,b) * \langle G(a,b,\bar{F}(a,b)) \rangle,$$

so $\bar{F}$ (and hence $F$) is primitive recursive.


The Ackermann Function. By diagonalization we can produce a computable
function that is not primitive recursive, but the so-called Ackermann function
does more, and plays a role in several contexts. First we define inductively a
sequence $A_0, A_1, A_2, \dots$ of primitive recursive functions $A_n : \mathbf{N} \to \mathbf{N}$:

$$A_0(y) = y + 1, \qquad A_{n+1}(0) = A_n(1),$$
$$A_{n+1}(y+1) = A_n(A_{n+1}(y)).$$

Thus $A_0 = S$ and $A_{n+1} \circ A_0 = A_n \circ A_{n+1}$. One verifies easily that $A_1(y) = y + 2$
and $A_2(y) = 2y + 3$ for all $y$. We define the Ackermann function $A : \mathbf{N}^2 \to \mathbf{N}$
by $A(n,y) := A_n(y)$.


Lemma 5.3.3. The function $A$ is computable, and strictly increasing in each
variable. Also, for all $n$ and $x, y$:

(i) $A_n(x+y) \ge A_n(x) + y$;
(ii) $n \ge 1 \implies A_{n+1}(y) > A_n(y) + y$;
(iii) $A_{n+1}(y) \ge A_n(y+1)$;
(iv) $2A_n(y) < A_{n+2}(y)$;
(v) $x < y \implies A_n(x+y) \le A_{n+2}(y)$.

Proof. We leave it as an exercise at the end of this section to show that $A$ is
computable. Assume inductively that $A_0,\dots,A_n$ are strictly increasing and
$A_0(y) < A_1(y) < \cdots < A_n(y)$ for all $y$. Then

$$A_{n+1}(y+1) = A_n(A_{n+1}(y)) \ge A_0(A_{n+1}(y)) > A_{n+1}(y),$$

so $A_{n+1}$ is strictly increasing. Next we show that $A_{n+1}(y) > A_n(y)$ for all $y$:
$A_{n+1}(0) = A_n(1)$, so $A_{n+1}(0) > A_n(0)$ and $A_{n+1}(0) > 1$, so $A_{n+1}(y) > y + 1$
for all $y$. Hence $A_{n+1}(y+1) = A_n(A_{n+1}(y)) > A_n(y+1)$.

Inequality (i) follows easily by induction on $n$, and a second induction on $y$.

For inequality (ii), we proceed again by induction on $(n,y)$: Using $A_1(y) =
y + 2$ and $A_2(y) = 2y + 3$, we obtain $A_2(y) > A_1(y) + y$. Let $n > 1$, and assume
inductively that $A_n(y) > A_{n-1}(y) + y$. Then $A_{n+1}(0) = A_n(1) > A_n(0) + 0$,
and

$$A_{n+1}(y+1) = A_n(A_{n+1}(y)) \ge A_n(y + 1 + A_n(y)) \ge A_n(y+1) + A_n(y) > A_n(y+1) + y + 1.$$

In (iii) we proceed by induction on $y$. We have equality for $y = 0$. Assuming
inductively that (iii) holds for a certain $y$ we obtain

$$A_{n+1}(y+1) = A_n(A_{n+1}(y)) \ge A_n(A_n(y+1)) \ge A_n(y+2).$$

Note that (iv) holds for $n = 0$. For $n > 0$ we have by (i), (ii) and (iii):

$$A_n(y) + A_n(y) \le A_n(y + A_n(y)) < A_n(A_{n+1}(y)) = A_{n+1}(y+1) \le A_{n+2}(y).$$

Note that (v) holds for $n = 0$. Assume (v) holds for a certain $n$. Let $x < y + 1$.
We can assume inductively that if $x < y$, then $A_{n+1}(x+y) \le A_{n+3}(y)$, and we
want to show that

$$A_{n+1}(x+y+1) \le A_{n+3}(y+1).$$

Case 1. $x = y$. Then

$$A_{n+1}(x+y+1) = A_{n+1}(2x+1) = A_n(A_{n+1}(2x)) \le A_{n+2}(2x) < A_{n+2}(A_{n+3}(x)) = A_{n+3}(y+1).$$

Case 2. $x < y$. Then

$$A_{n+1}(x+y+1) = A_n(A_{n+1}(x+y)) \le A_{n+2}(A_{n+3}(y)) = A_{n+3}(y+1).$$


Below we put $|x| := x_1 + \cdots + x_m$ for $x = (x_1,\dots,x_m) \in \mathbf{N}^m$.

Proposition 5.3.4. Given any primitive recursive function $F : \mathbf{N}^m \to \mathbf{N}$ there
is an $n = n(F)$ such that $F(x) \le A_n(|x|)$ for all $x \in \mathbf{N}^m$.

Proof. Call an $n = n(F)$ with the property above a bound for $F$. The nullary
constant function with value 0, the successor function $S$, and each coordinate
function $I^m_i$ ($1 \le i \le m$), has bound 0. Next, assume $F = G(H_1,\dots,H_k)$ where
$G : \mathbf{N}^k \to \mathbf{N}$ and $H_1,\dots,H_k : \mathbf{N}^m \to \mathbf{N}$ are primitive recursive, and assume
inductively that $n(G)$ and $n(H_1),\dots,n(H_k)$ are bounds for $G$ and $H_1,\dots,H_k$.
By part (iv) of the previous lemma we can take $N \in \mathbf{N}$ such that $n(G) \le N$,
and $\sum_i H_i(x) \le A_{N+1}(|x|)$ for all $x$. Then

$$F(x) = G(H_1(x),\dots,H_k(x)) \le A_N\Big(\sum_i H_i(x)\Big) \le A_N(A_{N+1}(|x|)) \le A_{N+2}(|x|).$$

Finally, assume that $F : \mathbf{N}^{m+1} \to \mathbf{N}$ is obtained by primitive recursion from
the primitive recursive functions $G : \mathbf{N}^m \to \mathbf{N}$ and $H : \mathbf{N}^{m+2} \to \mathbf{N}$, and assume
inductively that $n(G)$ and $n(H)$ are bounds for $G$ and $H$. Take $N \in \mathbf{N}$ such
that $n(G) \le N + 3$ and $n(H) \le N$. We claim that $N + 3$ is a bound for $F$:
$F(x,0) = G(x) \le A_{N+3}(|x|)$, and by part (v) of the lemma above,

$$F(x,y+1) = H(x,y,F(x,y)) \le A_N\{|x| + y + A_{N+3}(|x| + y)\} \le A_{N+2}\{A_{N+3}(|x| + y)\} = A_{N+3}(|x| + y + 1).$$


Consider the function $A^* : \mathbf{N} \to \mathbf{N}$ defined by $A^*(n) = A(n,n)$. Then $A^*$
is computable, and for any primitive recursive function $F : \mathbf{N} \to \mathbf{N}$ we have
$F(y) < A^*(y)$ for all $y > n(F)$, where $n(F)$ is a bound for $F$. In particular, $A^*$
is not primitive recursive. Hence $A$ is computable but not primitive recursive.

The recursion in “primitive recursion” involves only one variable; the other 
variables just act as parameters. The Ackermann function is defined by a re- 
cursion involving both variables: 


$$A(0,y) = y + 1, \qquad A(x+1,0) = A(x,1), \qquad A(x+1,y+1) = A(x, A(x+1,y)).$$


This kind of double recursion is therefore more powerful in some ways than what 
can be done in terms of primitive recursion and composition. 
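The double recursion $A(0,y) = y+1$, $A(x+1,0) = A(x,1)$, $A(x+1,y+1) = A(x,A(x+1,y))$ can be evaluated mechanically, which makes the computability of $A$ plausible. The Python sketch below uses an explicit stack of pending first arguments instead of recursive calls; this is our own implementation choice to avoid Python's recursion limit, not a method taken from the text. Only small arguments are feasible, since the values grow extremely fast.

```python
def ackermann(n, y):
    """Evaluate A(n, y) with an explicit stack of pending first arguments."""
    stack = [n]
    while stack:
        n = stack.pop()
        if n == 0:
            y = y + 1                 # A(0, y) = y + 1
        elif y == 0:
            stack.append(n - 1)       # A(n, 0) = A(n - 1, 1)
            y = 1
        else:
            stack.append(n - 1)       # A(n, y) = A(n - 1, A(n, y - 1))
            stack.append(n)
            y = y - 1
    return y

print([ackermann(1, y) for y in range(5)])   # values of y + 2
print([ackermann(2, y) for y in range(5)])   # values of 2y + 3
```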


Exercises. 

(1) The graph of the Ackermann function is primitive recursive. (It follows that the 
Ackermann function is recursive, and it gives an example showing that Lemma 5.1.9 
fails with “primitive recursive” in place of “computable” .) 


5.4 Representability 


Let $L$ be a numerical language, that is, $L$ contains the constant symbol $0$ and
the unary function symbol $S$. We let $S^n 0$ denote the term $S \dots S0$ in which $S$
appears exactly $n$ times. So $S^0 0$ is the term $0$, $S^1 0$ is the term $S0$, and so on.
Our key example of a numerical language is

$$L(\underline{\mathrm{N}}) := \{0, S, +, \cdot, <\} \quad \text{(the language of } \mathfrak{N}\text{)}.$$

Here $\underline{\mathrm{N}}$ is the following set of nine axioms, where we fix two distinct variables
$x$ and $y$ for the sake of definiteness:

N1  $\forall x\,(Sx \ne 0)$
N2  $\forall x \forall y\,(Sx = Sy \to x = y)$
N3  $\forall x\,(x + 0 = x)$
N4  $\forall x \forall y\,(x + Sy = S(x + y))$
N5  $\forall x\,(x \cdot 0 = 0)$
N6  $\forall x \forall y\,(x \cdot Sy = x \cdot y + x)$
N7  $\forall x\,(x \not< 0)$
N8  $\forall x \forall y\,(x < Sy \leftrightarrow x < y \vee x = y)$
N9  $\forall x \forall y\,(x < y \vee x = y \vee y < x)$


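These axioms are clearly true in $\mathfrak{N}$. The fact that $\underline{\mathrm{N}}$ is finite will play a role
later. It is a very weak set of axioms, but strong enough to prove numerical
facts like

$$SS0 + SSS0 = SSSSS0, \qquad \forall x\,(x < SS0 \to (x = 0 \vee x = S0)).$$

Lemma 5.4.1. For each $n$,

$$\underline{\mathrm{N}} \vdash x < S^{n+1}0 \leftrightarrow (x = 0 \vee \cdots \vee x = S^n 0).$$

Proof. By induction on $n$. For $n = 0$, $\underline{\mathrm{N}} \vdash x < S0 \leftrightarrow x = 0$ by axioms N8 and
N7. Assume $n > 0$ and $\underline{\mathrm{N}} \vdash x < S^n 0 \leftrightarrow (x = 0 \vee \cdots \vee x = S^{n-1}0)$. Use axiom
N8 to conclude that $\underline{\mathrm{N}} \vdash x < S^{n+1}0 \leftrightarrow (x = 0 \vee \cdots \vee x = S^n 0)$.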


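To give an impression of how weak $\underline{\mathrm{N}}$ is we consider some of its models:

Some models of $\underline{\mathrm{N}}$.

(1) We usually refer to $\mathfrak{N}$ as the standard model of $\underline{\mathrm{N}}$.

(2) Another model of $\underline{\mathrm{N}}$ is $\mathfrak{N}[x] := (\mathbf{N}[x]; \dots)$, where $0, S, +, \cdot$ are interpreted
as the zero polynomial, as the unary operation of adding 1 to a polynomial,
and as addition and multiplication of polynomials in $\mathbf{N}[x]$, and where $<$ is
interpreted as follows: $f(x) < g(x)$ iff $f(n) < g(n)$ for all large enough $n$.

(3) A more bizarre model of $\underline{\mathrm{N}}$: $(\mathbf{R}^{\ge 0}; \dots)$ with the usual interpretations of
$0, S, +, \cdot$, in particular $S(r) := r + 1$, and with $<$ interpreted as the binary
relation $<_{\mathbf{N}}$ on $\mathbf{R}^{\ge 0}$: $r <_{\mathbf{N}} s \iff (r, s \in \mathbf{N} \text{ and } r < s) \text{ or } s \notin \mathbf{N}$.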


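Example (2) shows that $\underline{\mathrm{N}} \nvdash \forall x \exists y\,(x = 2y \vee x = 2y + S0)$, since in $\mathfrak{N}[x]$ the
element $x$ is not in $2\mathbf{N}[x] \cup (2\mathbf{N}[x] + 1)$; in other words, $\underline{\mathrm{N}}$ cannot prove “every
element is even or odd.” In example (3) we have $1/2 <_{\mathbf{N}} 1/2$, so the binary
relation $<_{\mathbf{N}}$ on $\mathbf{R}^{\ge 0}$ is not even a total order. One useful fact about models of
$\underline{\mathrm{N}}$ is that they all contain the so-called standard model $\mathfrak{N}$ in a unique way: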


Lemma 5.4.2. Suppose $\mathcal{A} \models \underline{\mathrm{N}}$. Then there is a unique homomorphism

$$\iota : \mathfrak{N} \to \mathcal{A}.$$

This homomorphism $\iota$ is an embedding, and for all $a \in A$ and all $n$,

(i) if $a <^{\mathcal{A}} \iota(n)$, then $a = \iota(m)$ for some $m < n$;
(ii) if $a \notin \iota(\mathbf{N})$, then $\iota(n) <^{\mathcal{A}} a$.

As to the proof, note that for any homomorphism $\iota : \mathfrak{N} \to \mathcal{A}$ and all $n$ we
must have $\iota(n) = (S^n 0)^{\mathcal{A}}$. Hence there is at most one such homomorphism. It
remains to show that the map $n \mapsto (S^n 0)^{\mathcal{A}} : \mathbf{N} \to A$ is an embedding $\iota : \mathfrak{N} \to \mathcal{A}$
with properties (i) and (ii). We leave this as an exercise to the reader.


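Definition. Let $L$ be a numerical language, and $\Sigma$ a set of $L$-sentences. A relation
$R \subseteq \mathbf{N}^m$ is said to be $\Sigma$-representable, if there is an $L$-formula $\varphi(x_1,\dots,x_m)$
such that for all $(a_1,\dots,a_m) \in \mathbf{N}^m$ we have

(i) $R(a_1,\dots,a_m) \implies \Sigma \vdash \varphi(S^{a_1}0,\dots,S^{a_m}0)$
(ii) $\neg R(a_1,\dots,a_m) \implies \Sigma \vdash \neg\varphi(S^{a_1}0,\dots,S^{a_m}0)$

Such a $\varphi(x_1,\dots,x_m)$ is said to represent $R$ in $\Sigma$ or to $\Sigma$-represent $R$. Note that if
$\varphi(x_1,\dots,x_m)$ $\Sigma$-represents $R$ and $\Sigma$ is consistent, then for all $(a_1,\dots,a_m) \in \mathbf{N}^m$

$$R(a_1,\dots,a_m) \iff \Sigma \vdash \varphi(S^{a_1}0,\dots,S^{a_m}0),$$
$$\neg R(a_1,\dots,a_m) \iff \Sigma \vdash \neg\varphi(S^{a_1}0,\dots,S^{a_m}0).$$

A function $F : \mathbf{N}^m \to \mathbf{N}$ is $\Sigma$-representable if there is a formula $\varphi(x_1,\dots,x_m,y)$
of $L$ such that for all $(a_1,\dots,a_m) \in \mathbf{N}^m$ we have

$$\Sigma \vdash \varphi(S^{a_1}0,\dots,S^{a_m}0,y) \leftrightarrow y = S^{F(a_1,\dots,a_m)}0.$$

Such a $\varphi(x_1,\dots,x_m,y)$ is said to represent $F$ in $\Sigma$ or to $\Sigma$-represent $F$.

An $L$-term $t(x_1,\dots,x_m)$ is said to represent the function $F : \mathbf{N}^m \to \mathbf{N}$ in $\Sigma$ if
$\Sigma \vdash t(S^{a_1}0,\dots,S^{a_m}0) = S^{F(a)}0$ for all $a = (a_1,\dots,a_m) \in \mathbf{N}^m$. Note that then
the function $F$ is $\Sigma$-represented by the formula $t(x_1,\dots,x_m) = y$.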


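Proposition 5.4.3. Let $L$ be a numerical language, $\Sigma$ a set of $L$-sentences such
that $\Sigma \vdash S0 \ne 0$, and $R \subseteq \mathbf{N}^m$ a relation. Then

$$R \text{ is } \Sigma\text{-representable} \iff \chi_R \text{ is } \Sigma\text{-representable.}$$

Proof. ($\Leftarrow$) Assume $\chi_R$ is $\Sigma$-representable and let $\varphi(x_1,\dots,x_m,y)$ be an
$L$-formula $\Sigma$-representing it. We show that $\psi(x_1,\dots,x_m) := \varphi(x_1,\dots,x_m,S0)$
$\Sigma$-represents $R$. Let $(a_1,\dots,a_m) \in R$; then $\chi_R(a_1,\dots,a_m) = 1$. Hence

$$\Sigma \vdash \varphi(S^{a_1}0,\dots,S^{a_m}0,y) \leftrightarrow y = S0,$$

so $\Sigma \vdash \varphi(S^{a_1}0,\dots,S^{a_m}0,S0)$, that is, $\Sigma \vdash \psi(S^{a_1}0,\dots,S^{a_m}0)$. Likewise, but
now using also $\Sigma \vdash S0 \ne 0$, we show that if $(a_1,\dots,a_m) \notin R$, then $\Sigma \vdash
\neg\varphi(S^{a_1}0,\dots,S^{a_m}0,S0)$.

($\Rightarrow$) Conversely, assume $R$ is $\Sigma$-representable and let $\psi(x_1,\dots,x_m)$ be an
$L$-formula $\Sigma$-representing it. We show that $\varphi(x_1,\dots,x_m,y)$ given by

$$\varphi(x_1,\dots,x_m,y) := (\psi(x_1,\dots,x_m) \wedge y = S0) \vee (\neg\psi(x_1,\dots,x_m) \wedge y = 0)$$

$\Sigma$-represents $\chi_R$. Let $(a_1,\dots,a_m) \in R$. Then $\Sigma \vdash \psi(S^{a_1}0,\dots,S^{a_m}0)$, hence

$$\Sigma \vdash [(\psi(S^{a_1}0,\dots,S^{a_m}0) \wedge y = S0) \vee (\neg\psi(S^{a_1}0,\dots,S^{a_m}0) \wedge y = 0)] \leftrightarrow y = S0,$$

that is, $\Sigma \vdash \varphi(S^{a_1}0,\dots,S^{a_m}0,y) \leftrightarrow y = S0$. Likewise, for $(a_1,\dots,a_m) \notin R$, we
obtain $\Sigma \vdash \varphi(S^{a_1}0,\dots,S^{a_m}0,y) \leftrightarrow y = 0$.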


Theorem 5.4.4 (Representability). Each computable function F : ℕ^n → ℕ is
N-representable. Each computable relation R ⊆ ℕ^m is N-representable.


Proof. By Proposition 5.4.3 we need only consider the case of functions. We
make the following three claims:

(R1)' + : ℕ^2 → ℕ, · : ℕ^2 → ℕ, χ_≤ : ℕ^2 → ℕ, and the coordinate functions
I^n_i (for each n and i = 1,...,n) are N-representable.

(R2)' If G : ℕ^m → ℕ and H_1,...,H_m : ℕ^n → ℕ are N-representable, then so
is F = G(H_1,...,H_m) : ℕ^n → ℕ defined by


F(a) = G(H_1(a),...,H_m(a)).


(R3)' If G : ℕ^{n+1} → ℕ is N-representable, and for all a ∈ ℕ^n there exists
x ∈ ℕ such that G(a,x) = 0, then the function F : ℕ^n → ℕ given by


F(a) = μx(G(a,x) = 0)


is N-representable.
(R1)': The proof of this claim has six parts.


(i) The formula x_1 = x_2 represents {(a,b) ∈ ℕ^2 : a = b} in N:
Let a,b ∈ ℕ. If a = b, then obviously N ⊢ S^a 0 = S^b 0. Suppose that
a ≠ b. Then for every model A of N we have A ⊨ S^a 0 ≠ S^b 0, by
Lemma 5.4.2 and its proof. Hence N ⊢ S^a 0 ≠ S^b 0.


(ii) The term x_1 + x_2 represents + : ℕ^2 → ℕ in N:
Let a + b = c where a,b,c ∈ ℕ. By Lemma 5.4.2 and its proof we
have A ⊨ S^a 0 + S^b 0 = S^c 0 for each model A of N. It follows that
N ⊢ S^a 0 + S^b 0 = S^c 0.


(iii) The term x_1 · x_2 represents · : ℕ^2 → ℕ in N:
The proof is similar to that of (ii).


(iv) The formula x_1 < x_2 represents {(a,b) ∈ ℕ^2 : a < b} in N:
The proof is similar to that of (i).

(v) χ_≤ : ℕ^2 → ℕ is N-representable:
By (i) and (iv), the formula x_1 < x_2 ∨ x_1 = x_2 represents the set
{(a,b) ∈ ℕ^2 : a ≤ b} in N. So by Proposition 5.4.3, χ_≤ : ℕ^2 → ℕ is
N-representable.

(vi) For n ≥ 1 and 1 ≤ i ≤ n, the term t^n_i(x_1,...,x_n) := x_i represents
the function I^n_i : ℕ^n → ℕ in N. This is obvious.

(R2)': Let x_1,...,x_n,y_1,...,y_m,z be distinct variables, let G : ℕ^m → ℕ be
N-represented by ψ(y_1,...,y_m,z), and let H_i : ℕ^n → ℕ be N-represented
by φ_i(x_1,...,x_n,y_i) for i = 1,...,m.
Claim: F = G(H_1,...,H_m) is N-represented by


θ(x_1,...,x_n,z) := ∃y_1 ... ∃y_m ((⋀_{i=1}^m φ_i(x_1,...,x_n,y_i)) ∧ ψ(y_1,...,y_m,z)).


Put a = (a_1,...,a_n) and let c = F(a). We have to show that
N ⊢ θ(S^a 0, z) ↔ z = S^c 0, where S^a 0 abbreviates S^{a_1}0,...,S^{a_n}0.


Let b_i = H_i(a) and put b = (b_1,...,b_m). Then F(a) = G(b) = c. Therefore,
N ⊢ ψ(S^b 0, z) ↔ z = S^c 0 and


N ⊢ φ_i(S^a 0, y_i) ↔ y_i = S^{b_i}0    (i = 1,...,m).


Argue in models to conclude: N ⊢ θ(S^a 0, z) ↔ z = S^c 0.


(R3)': Let G : ℕ^{n+1} → ℕ be such that for all a ∈ ℕ^n there exists b ∈ ℕ with
G(a,b) = 0. Define F : ℕ^n → ℕ by F(a) = μb(G(a,b) = 0). Suppose
that G is N-represented by φ(x_1,...,x_n,y,z). We claim that the formula


ψ(x_1,...,x_n,y) := φ(x_1,...,x_n,y,0) ∧ ∀w(w < y → ¬φ(x_1,...,x_n,w,0))


N-represents F. Let a ∈ ℕ^n and let b = F(a). Then G(a,i) ≠ 0 for i < b
and G(a,b) = 0. Therefore, N ⊢ φ(S^a 0, S^b 0, z) ↔ z = 0 and for i < b,
G(a,i) ≠ 0 and N ⊢ φ(S^a 0, S^i 0, z) ↔ z = S^{G(a,i)}0. By arguing in models
using Lemma 5.4.2 we obtain N ⊢ ψ(S^a 0, y) ↔ y = S^b 0, as claimed.
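
The minimalization step in (R3)' can be mirrored at the level of ordinary functions. The Python sketch below is only an illustration and not part of the proof: the names minimalize and satisfies_psi are hypothetical, G is assumed to be a total Python function, and the search terminates only under the hypothesis of (R3)' that G(a,x) = 0 for some x. The function satisfies_psi checks the semantic content of the two conjuncts of ψ.

```python
def minimalize(G):
    """Return F with F(a) = μx(G(a, x) = 0); loops forever if no such x exists."""
    def F(*a):
        x = 0
        while G(*a, x) != 0:
            x += 1
        return x
    return F


def satisfies_psi(G, a, y):
    """Semantic counterpart of ψ: G(a, y) = 0 and G(a, w) ≠ 0 for all w < y."""
    return G(*a, y) == 0 and all(G(*a, w) != 0 for w in range(y))


# Illustrative G (an assumption, not from the text): G(a, x) = 0 iff x*x >= a.
def G(a, x):
    return 0 if x * x >= a else 1


F = minimalize(G)  # F(a) is the least x with x*x >= a
```

For this G, the value y = F(10) is the only y for which both conjuncts hold, which is the uniqueness that the formula ψ expresses.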


Remark. The converse of this theorem is also true, and is plausible from the 
Church-Turing Thesis. We shall prove the converse in the next section. 


Exercises. In the exercises below, L is a numerical language and Σ is a set of
L-sentences.


(1) Suppose Σ ⊢ S^m 0 ≠ S^n 0 whenever m ≠ n. If a function F : ℕ^m → ℕ is
Σ-represented by the L-formula φ(x_1,...,x_m,y), then the graph of F, as a relation
of arity m+1 on ℕ, is Σ-represented by φ(x_1,...,x_m,y). (This result applies to
Σ = N, since N ⊢ S^m 0 ≠ S^n 0 whenever m ≠ n.)


(2) Suppose Σ ⊇ N. Then the set of all Σ-representable functions F : ℕ^m → ℕ
(m = 0,1,2,...) is closed under composition and minimalization.


5.5 Decidability and Gödel Numbering


Definition. An L-theory T is a set of L-sentences closed under provability, that
is, whenever T ⊢ σ, then σ ∈ T.

Examples.

(1) Given a set Σ of L-sentences, the set Th(Σ) := {σ : Σ ⊢ σ} of theorems
of Σ is an L-theory. If we need to indicate the dependence on L we write
Th_L(Σ) for Th(Σ). We say that Σ axiomatizes an L-theory T (or is an
axiomatization of T) if T = Th(Σ). For Σ = ∅ we also refer to Th_L(Σ) as
predicate logic in L.

(2) Given an L-structure A, the set Th(A) := {σ : A ⊨ σ} is also an
L-theory, called the theory of A. Note that the theory of A is automatically
complete.

(3) Given any class K of L-structures, the set


Th(K) := {σ : A ⊨ σ for all A ∈ K}


is an L-theory, called the theory of K. For example, for L = L_Ring, and K
the class of finite fields, Th(K) is the set of L-sentences that are true in all
finite fields.


The decision problem for a given L-theory T is to find an algorithm to decide for 
any L-sentence o whether or not o belongs to T. Since we have not (yet) defined 
the concept of algorithm, this is just an informal description at this stage. One 
of our goals in this section is to define a formal counterpart, called decidability. 
In the next section we show that the L(N)-theory Th(𝔑) is undecidable; by the
Church-Turing Thesis, this means that the decision problem for Th(𝔑) has no
solution. (This result is a version of Church’s Theorem, and is closely related 
to the Incompleteness Theorem.) 

In the rest of this chapter the language L is assumed to be finite unless we say 
otherwise. This is done for simplicity, and at the end of this section we indicate 
how to avoid this assumption. We shall number the terms and formulas of L 
in such a way that various statements about these formulas and about formal 
proofs in this language can be translated effectively into equivalent statements 
about natural numbers expressible by sentences in L(N). 

Recall that v_0, v_1, v_2, ... are our variables. We assign to each symbol


s ∈ {v_0, v_1, v_2, ...} ∪ {logical symbols} ∪ L


a symbol number SN(s) ∈ ℕ as follows: SN(v_i) := 2i and to each remaining
symbol, in the finite set {logical symbols} ∪ L, we assign an odd natural number
as symbol number, subject to the condition that different symbols have different
symbol numbers.


Definition. The Gödel number ⌜t⌝ of an L-term t is defined recursively:


⌜t⌝ = ⟨SN(v_i)⟩                        if t = v_i,
⌜t⌝ = ⟨SN(F), ⌜t_1⌝, ..., ⌜t_n⌝⟩       if t = F t_1 ... t_n.

The Gödel number ⌜φ⌝ of an L-formula φ is given recursively by


⌜φ⌝ = ⟨SN(⊤)⟩                          if φ = ⊤,
      ⟨SN(⊥)⟩                          if φ = ⊥,
      ⟨SN(=), ⌜t_1⌝, ⌜t_2⌝⟩            if φ = (t_1 = t_2),
      ⟨SN(R), ⌜t_1⌝, ..., ⌜t_m⌝⟩       if φ = R t_1 ... t_m,
      ⟨SN(¬), ⌜ψ⌝⟩                     if φ = ¬ψ,
      ⟨SN(∨), ⌜φ_1⌝, ⌜φ_2⌝⟩            if φ = φ_1 ∨ φ_2,
      ⟨SN(∧), ⌜φ_1⌝, ⌜φ_2⌝⟩            if φ = φ_1 ∧ φ_2,
      ⟨SN(∃), ⌜x⌝, ⌜ψ⌝⟩                if φ = ∃xψ,
      ⟨SN(∀), ⌜x⌝, ⌜ψ⌝⟩                if φ = ∀xψ.
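
To see the recursion at work, here is a small Python sketch for the language L(N). It rests on two assumptions that are not taken from the text: nested Python tuples stand in for the numerical sequence codes ⟨...⟩, and the odd symbol numbers chosen for <, 0, S, +, · are hypothetical (any injective assignment of odd numbers would do). Expressions are given in prefix form, with the variable v_i written ("v", i).

```python
# Assumed symbol numbers (hypothetical; any injective odd assignment works).
SN = {"⊤": 1, "⊥": 3, "¬": 5, "∨": 7, "∧": 9, "=": 11, "∃": 13, "∀": 15,
      "<": 17, "0": 19, "S": 23, "+": 27, "·": 31}


def godel(e):
    """Code of a term or formula given as a nested tuple in prefix form.

    A variable v_i is written ("v", i); every other expression is
    (symbol, subexpression, ...). Tuples stand in for sequence codes.
    """
    head, *args = e
    if head == "v":
        return (2 * args[0],)  # SN(v_i) = 2i
    return (SN[head], *(godel(x) for x in args))
```

For example, godel(("∃", ("v", 0), ("=", ("v", 0), ("S", ("0",))))) computes the code of ∃v_0 (v_0 = S0) clause by clause, exactly as in the definition.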


Lemma 5.5.1. The following subsets of ℕ are computable:

(1) Vble := {⌜x⌝ : x is a variable}

(2) Term := {⌜t⌝ : t is an L-term}

(3) AFor := {⌜φ⌝ : φ is an atomic L-formula}

(4) For := {⌜φ⌝ : φ is an L-formula}


Proof. (1) a ∈ Vble iff a = ⟨2b⟩ for some b < a.

(2) a ∈ Term iff a ∈ Vble or a = ⟨SN(F), ⌜t_1⌝, ..., ⌜t_n⌝⟩ for some function
symbol F of L of arity n and L-terms t_1,...,t_n with Gödel numbers < a.

We leave (3) to the reader.

(4) We have For(a) ⟺

For((a)_1)                      if a = ⟨SN(¬), (a)_1⟩,
For((a)_1) and For((a)_2)       if a = ⟨SN(∨), (a)_1, (a)_2⟩
                                or a = ⟨SN(∧), (a)_1, (a)_2⟩,
Vble((a)_1) and For((a)_2)      if a = ⟨SN(∃), (a)_1, (a)_2⟩
                                or a = ⟨SN(∀), (a)_1, (a)_2⟩,
AFor(a)                         otherwise.


So For is computable.
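
The case distinctions in this proof translate into a recursive recognizer. The following Python sketch is an illustration under stated assumptions, not the construction of the text: tuples stand in for sequence codes, and the symbol numbers and arities for the symbols of L(N) are hypothetical choices. Each function returns True exactly when its argument has the required shape.

```python
# Assumed symbol numbers and arities for L(N) (hypothetical choices).
TOP, BOT, NOT, OR, AND, EQ, EX, ALL = 1, 3, 5, 7, 9, 11, 13, 15
FUNC = {19: 0, 23: 1, 27: 2, 31: 2}  # 0, S, +, ·  -> arity
REL = {17: 2}                        # <           -> arity


def is_code(a):
    return isinstance(a, tuple) and len(a) >= 1 and isinstance(a[0], int)


def vble(a):
    # a = <2b> for some b
    return is_code(a) and len(a) == 1 and a[0] >= 0 and a[0] % 2 == 0


def term(a):
    if vble(a):
        return True
    return (is_code(a) and a[0] in FUNC and len(a) == FUNC[a[0]] + 1
            and all(term(t) for t in a[1:]))


def afor(a):
    if not is_code(a):
        return False
    if a in ((TOP,), (BOT,)):
        return True
    if a[0] == EQ:
        return len(a) == 3 and term(a[1]) and term(a[2])
    return a[0] in REL and len(a) == REL[a[0]] + 1 and all(term(t) for t in a[1:])


def formula(a):
    if not is_code(a):
        return False
    if a[0] == NOT and len(a) == 2:
        return formula(a[1])
    if a[0] in (OR, AND) and len(a) == 3:
        return formula(a[1]) and formula(a[2])
    if a[0] in (EX, ALL) and len(a) == 3:
        return vble(a[1]) and formula(a[2])
    return afor(a)  # the "otherwise" clause
```

The recursion terminates because every recursive call is on a proper component of its argument, which corresponds to the fact that the components of a sequence code are smaller than the code itself.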


In the next two lemmas, x ranges over variables, φ and ψ over L-formulas, and
t and τ over L-terms.


Lemma 5.5.2. The function Sub : ℕ^3 → ℕ defined by Sub(a,b,c) =


c                                              if Vble(a) and a = b,

⟨(a)_0, Sub((a)_1,b,c), ..., Sub((a)_n,b,c)⟩   if a = ⟨(a)_0,...,(a)_n⟩ with n > 0 and
                                               (a)_0 ≠ SN(∃), (a)_0 ≠ SN(∀),

⟨SN(∃), (a)_1, Sub((a)_2,b,c)⟩                 if a = ⟨SN(∃), (a)_1, (a)_2⟩ and (a)_1 ≠ b,

⟨SN(∀), (a)_1, Sub((a)_2,b,c)⟩                 if a = ⟨SN(∀), (a)_1, (a)_2⟩ and (a)_1 ≠ b,

a                                              otherwise

is computable, and satisfies


Sub(⌜t⌝, ⌜x⌝, ⌜τ⌝) = ⌜t(τ/x)⌝ and Sub(⌜φ⌝, ⌜x⌝, ⌜τ⌝) = ⌜φ(τ/x)⌝.

Proof. Exercise; see also exercise (2) of Section 2.5.

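
The defining clauses of Sub in Lemma 5.5.2 can be transcribed almost literally. The Python sketch below is an illustration only: it assumes that tuples stand in for sequence codes and that ∃ and ∀ have the symbol numbers 13 and 15; the name sub is hypothetical.

```python
EX, ALL = 13, 15  # assumed symbol numbers of ∃ and ∀


def vble(a):
    return (isinstance(a, tuple) and len(a) == 1
            and isinstance(a[0], int) and a[0] % 2 == 0)


def sub(a, b, c):
    """Replace the free occurrences of the variable coded by b in a by c."""
    if vble(a) and a == b:
        return c
    if isinstance(a, tuple) and len(a) > 1 and a[0] not in (EX, ALL):
        return (a[0], *(sub(x, b, c) for x in a[1:]))
    if isinstance(a, tuple) and len(a) == 3 and a[0] in (EX, ALL) and a[1] != b:
        return (a[0], a[1], sub(a[2], b, c))
    return a  # includes the case of a quantifier binding the variable b
```

As in the lemma, occurrences of the variable under a quantifier that binds it are left alone, and no check is made that the substituted term is free for the variable; that check is the job of FrSub in the next lemma.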

Lemma 5.5.3. The following relations on ℕ are computable:

(1) PrAx := {⌜φ⌝ : φ is a propositional axiom} ⊆ ℕ

(2) Eq := {⌜φ⌝ : φ is an equality axiom} ⊆ ℕ

(3) Fr := {(⌜φ⌝, ⌜x⌝) : x occurs free in φ} ⊆ ℕ^2

(4) FrSub := {(⌜φ⌝, ⌜x⌝, ⌜τ⌝) : τ is free for x in φ} ⊆ ℕ^3

(5) Quant := {⌜ψ⌝ : ψ is a quantifier axiom} ⊆ ℕ

(6) MP := {(⌜φ_1⌝, ⌜φ_1 → φ_2⌝, ⌜φ_2⌝) : φ_1, φ_2 are L-formulas} ⊆ ℕ^3

(7) Gen := {(⌜φ⌝, ⌜ψ⌝) : ψ follows from φ by the generalization rule} ⊆ ℕ^2

(8) Sent := {⌜φ⌝ : φ is a sentence} ⊆ ℕ


Proof. This is a lengthy, tedious, but routine exercise. The idea is to translate
the usual inductive or explicit description of the relevant syntactic notions into
a description of its “Gödel image” that establishes computability of this image.
For example, when φ = φ_1 ∨ φ_2, one can use facts like: x occurs free in φ iff x
occurs free in φ_1 or in φ_2; and τ is free for x in φ iff τ is free for x in φ_1 and
τ is free for x in φ_2. As usual, the main inductive steps concern terms, atomic
formulas, and formulas that start with a quantifier symbol. See also exercise
(1) of Section 2.7. As to (8), we have


Sent(a) ⟺ For(a) and ∀i<a ¬Fr(a, i),


so (8) follows from (3) and earlier results.


In the rest of this Section Σ is a set of L-sentences. Put

⌜Σ⌝ := {⌜σ⌝ : σ ∈ Σ},

and call Σ computable if ⌜Σ⌝ is computable.

Definition. Prf_Σ is the set of Gödel numbers of proofs from Σ, that is,

Prf_Σ := {⟨⌜φ_1⌝, ..., ⌜φ_n⌝⟩ : φ_1,...,φ_n is a proof from Σ}.


So every element of Prf_Σ is of the form ⟨⌜φ_1⌝, ..., ⌜φ_n⌝⟩ where n ≥ 1 and
every φ_k is either in Σ, or a logical axiom, or obtained from some φ_i, φ_j with
1 ≤ i,j < k by Modus Ponens, or obtained from some φ_i with 1 ≤ i < k by
Generalization.


Lemma 5.5.4. If Σ is computable, then Prf_Σ is computable.


Proof. This is because a is in Prf_Σ iff Seq(a) and lh(a) ≠ 0 and for every k <
lh(a) either (a)_k ∈ ⌜Σ⌝ ∪ PrAx ∪ Eq ∪ Quant or ∃i,j < k : MP((a)_i, (a)_j, (a)_k)
or ∃i < k : Gen((a)_i, (a)_k).
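
The characterization in this proof is in effect a proof checker. The Python sketch below shows its shape under simplifying assumptions: a proof is a Python list of formula codes (tuples standing in for sequence codes), the predicates in_sigma, is_logical_axiom and gen are supplied by the caller as hypothetical stand-ins for ⌜Σ⌝, PrAx ∪ Eq ∪ Quant and Gen, and φ → ψ is read as the abbreviation ¬φ ∨ ψ with assumed symbol numbers 5 and 7.

```python
NOT, OR = 5, 7  # assumed symbol numbers of ¬ and ∨


def implies(p, q):
    """Code of p → q, read as the abbreviation ¬p ∨ q."""
    return (OR, (NOT, p), q)


def mp(p, pq, q):
    """MP(p, pq, q): pq is the code of p → q."""
    return pq == implies(p, q)


def is_proof(a, in_sigma, is_logical_axiom, gen):
    """Check a finite sequence a of formula codes clause by clause."""
    if len(a) == 0:
        return False
    for k, phi in enumerate(a):
        ok = (in_sigma(phi) or is_logical_axiom(phi)
              or any(mp(a[i], a[j], phi) for i in range(k) for j in range(k))
              or any(gen(a[i], phi) for i in range(k)))
        if not ok:
            return False
    return True
```

All searches are bounded by the length of the sequence, which is the reason the corresponding relation on ℕ is computable as soon as its ingredients are.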


Definition. An L-theory T is said to be computably axiomatizable if T has a
computable axiomatization.³


³Instead of “computably axiomatizable,” also “recursively axiomatizable” and “effectively
axiomatizable” are used.

We say that T is decidable if ⌜T⌝ is computable, and undecidable otherwise.
(Thus “T is decidable” means the same thing as “T is computable,” but for
L-theories “decidable” is more widely used than “computable”.)


Definition. A relation R ⊆ ℕ^n is said to be computably generated if there is a
computable relation Q ⊆ ℕ^{n+1} such that for all a ∈ ℕ^n we have


R(a) ⟺ ∃x Q(a, x).

“Recursively enumerable” is also used for “computably generated.”
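
The alternative name suggests a listing procedure. The following Python sketch, for the case n = 1, is an illustration and not part of the text: given a computable Q (here an arbitrary total Python function, an assumption), it lists the elements of R = {a : ∃x Q(a,x)} by searching through all pairs (a,x) in order of increasing a + x. The name enumerate_generated is hypothetical.

```python
from itertools import count


def enumerate_generated(Q):
    """Yield each a with Q(a, x) for some x, exactly once; never terminates."""
    seen = set()
    for s in count():              # s = a + x
        for a in range(s + 1):
            x = s - a
            if a not in seen and Q(a, x):
                seen.add(a)
                yield a
```

Every element of R appears after finitely many steps, but the procedure gives no way of concluding that a number is not in R; this asymmetry is what separates computably generated from computable.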


Remark. Every computable relation is obviously computably generated. We 
leave it as an exercise to check that the union and intersection of two computably 
generated n-ary relations on N are computably generated. The complement of 
a computably generated subset of N is not always computably generated, as we 
shall see later. 


Lemma 5.5.5. If Σ is computable, then ⌜Th(Σ)⌝ is computably generated.


Proof. Apply Lemma 5.5.4 and the fact that for all a ∈ ℕ


a ∈ ⌜Th(Σ)⌝ ⟺ ∃b(Prf_Σ(b) and a = (b)_{lh(b) ∸ 1} and Sent(a)).


Proposition 5.5.6 (Negation Theorem). Let A ⊆ ℕ^n and suppose A and ¬A
are computably generated. Then A is computable.


Proof. Let P, Q ⊆ ℕ^{n+1} be computable such that for all a ∈ ℕ^n we have

A(a) ⟺ ∃x P(a, x),    ¬A(a) ⟺ ∃x Q(a, x).


Then there is for each a ∈ ℕ^n an x ∈ ℕ such that (P ∨ Q)(a, x). The
computability of A follows by noting that for all a ∈ ℕ^n we have


A(a) ⟺ P(a, μx(P ∨ Q)(a, x)).
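
The last display is a decision procedure in disguise: search for the first witness of either kind and report which kind it is. Here is a Python sketch of that idea, with the hypothetical name decide; P and Q are assumed to be total Python predicates such that for every a some x satisfies P(a,x) or Q(a,x), since otherwise the loop does not terminate.

```python
def decide(P, Q):
    """Return the characteristic predicate a ↦ P(a, μx(P ∨ Q)(a, x))."""
    def A(*a):
        x = 0
        while not (P(*a, x) or Q(*a, x)):
            x += 1
        return P(*a, x)
    return A


# Illustrative instance (an assumption): A is the set of perfect squares.
is_square = decide(lambda a, x: x * x == a,
                   lambda a, x: x * x < a < (x + 1) * (x + 1))
```

In the example, P(a,x) witnesses that a is a square and Q(a,x) witnesses that a lies strictly between two consecutive squares, so exactly one kind of witness exists for each a.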


Proposition 5.5.7. Every complete and computably axiomatizable L-theory is 
decidable. 


Proof. Let T be a complete L-theory with computable axiomatization Σ. Then
⌜T⌝ = ⌜Th(Σ)⌝ is computably generated. Now observe:


a ∉ ⌜T⌝ ⟺ a ∉ Sent or ⟨SN(¬), a⟩ ∈ ⌜T⌝
        ⟺ a ∉ Sent or ∃b(Prf_Σ(b) and (b)_{lh(b) ∸ 1} = ⟨SN(¬), a⟩).


Hence the complement of ⌜T⌝ is computably generated. Thus T is decidable by
the Negation Theorem.

Representability implies Computability. We prove here the converse of
the Representability Theorem, as promised at the end of Section 5.4. In this
subsection we assume that L is numerical.


Lemma 5.5.8. The function Num : ℕ → ℕ defined by Num(a) = ⌜S^a 0⌝ is
computable.


Proof. Num(0) = ⌜0⌝ and Num(a + 1) = ⟨SN(S), Num(a)⟩.


Thus, given an L-formula φ(x), the function

a ↦ ⌜φ(S^a 0)⌝ = Sub(⌜φ⌝, ⌜x⌝, Num(a))


is computable; this should also be intuitively clear. Such computable functions
will play an important role in what follows.
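
The recursion for Num in the proof of Lemma 5.5.8 can be written out directly. In this Python sketch tuples stand in for sequence codes, and the symbol numbers 19 for 0 and 23 for S are hypothetical choices; the name num is likewise an assumption.

```python
SN_ZERO, SN_S = 19, 23  # assumed symbol numbers of 0 and S


def num(a):
    """Num(a): the code of the numeral S^a 0."""
    code = (SN_ZERO,)          # Num(0) = code of the term 0
    for _ in range(a):
        code = (SN_S, code)    # Num(a + 1) = <SN(S), Num(a)>
    return code
```

Composing num with a substitution function on codes then gives the map a ↦ ⌜φ(S^a 0)⌝ described above.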


Proposition 5.5.9. Suppose Σ is a computable consistent set of L-sentences.
Then every Σ-representable U ⊆ ℕ^n is computable.


Proof. The case n = 0 is trivial, so let n ≥ 1. Suppose φ(x_1,...,x_n) is an
L-formula that Σ-represents U ⊆ ℕ^n. As Σ is consistent, we have


U(a_1,...,a_n) ⟺ Σ ⊢ φ(S^{a_1}0,...,S^{a_n}0)    (a_1,...,a_n ∈ ℕ).


Define s : ℕ^n → ℕ by s(a_1,...,a_n) := ⌜φ(S^{a_1}0,...,S^{a_n}0)⌝. It is intuitively
clear that s is computable, but here is a proof. For i = 1,...,n, define


s_i : ℕ^i → ℕ,    s_i(a_1,...,a_i) := ⌜φ(S^{a_1}0,...,S^{a_i}0,x_{i+1},...,x_n)⌝.

Then s_1(a_1) = Sub(⌜φ⌝, ⌜x_1⌝, Num(a_1)) (a_1 ∈ ℕ), so s_1 is computable. Next,

s_{i+1}(a_1,...,a_i,a_{i+1}) = Sub(s_i(a_1,...,a_i), ⌜x_{i+1}⌝, Num(a_{i+1}))    (1 ≤ i < n)


for all a_1,...,a_{i+1} ∈ ℕ, so the computability of s_i gives that of s_{i+1}. Thus
s = s_n is computable. By the first display we obtain that for all a ∈ ℕ^n,


U(a) ⟺ Σ ⊢ φ(S^a 0) ⟺ s(a) ∈ ⌜Th(Σ)⌝.


Now Σ is computable, so ⌜Th(Σ)⌝ ⊆ ℕ is computably generated by Lemma 5.5.5.
Take a computable R ⊆ ℕ^2 such that for all x ∈ ℕ,


x ∈ ⌜Th(Σ)⌝ ⟺ ∃y R(x, y).

Then by the above, for all a ∈ ℕ^n,

U(a) ⟺ ∃y R(s(a), y),


exhibiting U as computably generated. By the definition of “Σ-representable”
the complement ¬U is also Σ-representable, so ¬U is computably generated as
well. Then by the Negation Theorem U is computable.

Corollary 5.5.10. Suppose L ⊇ L(N) and Σ ⊇ N is a computable consistent set of
L-sentences. Then every Σ-representable function f : ℕ^n → ℕ is computable.


Proof. Suppose f : ℕ^n → ℕ is Σ-representable. Then its graph is Σ-representable
by Exercise (1) of Section 5.4, so this graph is computable by Proposition 5.5.9.
Thus f is computable by Lemma 5.1.9.


In view of the Representability Theorem, these two results also give a nice
characterization of computability. Let a function f : ℕ^n → ℕ be given.


Corollary 5.5.11. f is computable if and only if f is N-representable.


Relaxing the assumption of a finite language. Much of the above does
not really need the assumption that L is finite. In the discussion below we only
assume that L is countable, so the case of finite L is included. First, we assign
to each symbol

s ∈ {v_0, v_1, v_2, ...} ∪ {logical symbols}


its symbol number SN(s) ∈ ℕ as follows: SN(v_i) := 2i, and for


s = ⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀, respectively, put
SN(s) = 1, 3, 5, 7, 9, 11, 13, 15, respectively.


This part of our numbering of symbols is independent of L.


By a numbering of L we mean an injective function L → ℕ that assigns to each
s ∈ L an odd natural number SN(s) > 15 such that if s is a relation symbol,
then SN(s) ≡ 1 mod 4, and if s is a function symbol, then SN(s) ≡ 3 mod 4.
Such a numbering of L is said to be computable if the sets


SN(L) = {SN(s) : s ∈ L} ⊆ ℕ,    {(SN(s), arity(s)) : s ∈ L} ⊆ ℕ^2


are computable. (So if L is finite, then every numbering of L is computable.)
Given a numbering of L we use it to assign to each L-term t and each
L-formula φ its Gödel number ⌜t⌝ and ⌜φ⌝, just as we did earlier in the section.
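
The point of the congruence conditions is that the kind of a symbol can be read off from its number alone. A small Python sketch of this decoding follows; the name classify is hypothetical, and whether an odd n > 15 really is the number of a symbol of L (and of what arity) depends on the two sets in the display above, which the sketch does not model.

```python
LOGICAL = {1: "⊤", 3: "⊥", 5: "¬", 7: "∨", 9: "∧", 11: "=", 13: "∃", 15: "∀"}


def classify(n):
    """Say what kind of symbol a natural number n can be the number of."""
    if n % 2 == 0:
        return ("variable", n // 2)       # SN(v_i) = 2i
    if n <= 15:
        return ("logical", LOGICAL[n])
    return ("relation",) if n % 4 == 1 else ("function",)
```

For a computable numbering, membership of n in SN(L) and the arity of the symbol are then decided by the two computable sets, which is what lets the earlier lemmas go through.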


Suppose a computable numbering of L is given, with the corresponding Gödel
numbering of L-terms and L-formulas. Then Lemmas 5.5.1, 5.5.2, and 5.5.3 go
through, as a diligent reader can easily verify.

Let also a set Σ of L-sentences be given. Define ⌜Σ⌝ := {⌜σ⌝ : σ ∈ Σ}, and
call Σ computable if ⌜Σ⌝ is a computable subset of ℕ. We define Prf_Σ to be the
set of all Gödel numbers of proofs from Σ, and for an L-theory T we define the
notions of T being computably axiomatizable, T being decidable, and T being
undecidable, all just as we did earlier in this section. (Note, however, that the
definitions of these notions are all relative to our given computable numbering
of L; for finite L different choices of numbering of L yield equivalent notions of Σ
being computable, T being computably axiomatizable, and T being decidable.)
It is now routine to check that Lemmas 5.5.4, 5.5.5, Propositions 5.5.7, 5.5.9,
and Corollary 5.5.10 go through.

Exercises.
(1) If f : ℕ → ℕ is computable and f(x) > x for all x ∈ ℕ, then f(ℕ) is computable.


(2) Let the set X ⊆ ℕ be nonempty. Then X is computably generated iff there is a
computable function f : ℕ → ℕ such that X = f(ℕ). Moreover, if X is infinite
and computably generated, then such f can be chosen to be injective.


(3) Every infinite computably generated subset of ℕ has an infinite computable
subset.


(4) A function F : ℕ^n → ℕ is computable iff its graph is computably generated.


(5) Let a and b denote positive real numbers. Call a computable if there are
computable functions f, g : ℕ → ℕ such that for all n > 0,


g(n) ≠ 0 and |a − f(n)/g(n)| < 1/n.

Then:


(i) every positive rational number is computable, and e is computable;


(ii) if a and b are computable, so are a + b, ab, and 1/a, and if in addition
a > b, then a − b is also computable;


(iii) a is computable if and only if the binary relation R_a on ℕ defined by

R_a(m,n) ⟺ n > 0 and m/n < a


is computable. (Hint: use the Negation Theorem.)


5.6 Theorems of Gödel and Church


In this section we assume that the finite language L extends L(N). 
Theorem 5.6.1 (Church). No consistent L-theory extending N is decidable. 
Before giving the proof we record the following consequence: 


Corollary 5.6.2 (Weak form of Gödel’s Incompleteness Theorem). Each
computably axiomatizable L-theory extending N is incomplete.


Proof. Immediate from 5.5.7 and Church’s Theorem. 


We will indicate in the next Section how to construct for any consistent
computable set of L-sentences Σ ⊇ N an L-sentence σ such that Σ ⊬ σ and Σ ⊬ ¬σ.
(The corollary above only says that such a sentence exists.)

For the proof of Church’s Theorem we need a few lemmas.
Let P ⊆ A^2 be any binary relation on a set A. For a ∈ A, we let P(a) ⊆ A be
given by the equivalence P(a)(b) ⟺ P(a, b).


Lemma 5.6.3 (Cantor). Given any P ⊆ A^2, its antidiagonal Q ⊆ A defined by

Q(b) ⟺ ¬P(b, b)


is not of the form P(a) for any a ∈ A.

Proof. Suppose Q = P(a), where a ∈ A. Then Q(a) iff P(a,a). But by
definition, Q(a) iff ¬P(a,a), a contradiction.


This is essentially Cantor’s proof that no f : A → 𝒫(A) can be surjective. (Use
P(a,b) :⟺ b ∈ f(a); then P(a) = f(a).)
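
On a finite set the lemma can be checked mechanically. The Python sketch below is an illustration only; the set A, the relation P and the names section and antidiagonal are arbitrary choices made for the example.

```python
def section(P, A, a):
    """The set P(a) = {b in A : P(a, b)}."""
    return frozenset(b for b in A if P(a, b))


def antidiagonal(P, A):
    """The set Q = {b in A : not P(b, b)}."""
    return frozenset(b for b in A if not P(b, b))


# Arbitrary example data (assumptions).
A = range(8)


def P(a, b):
    return (a * b + a + b) % 3 == 0


Q = antidiagonal(P, A)
```

Whatever relation P is chosen, Q differs from the section P(a) at the point a itself, so the comparison fails for every a in A.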


Definition. Let Σ be a set of L-sentences. We fix a variable x (e.g. x = v_0)
and define the binary relation P^Σ ⊆ ℕ^2 by


P^Σ(a,b) ⟺ Sub(a, ⌜x⌝, Num(b)) ∈ ⌜Th(Σ)⌝.

For an L-formula φ(x) and a = ⌜φ(x)⌝, we have

Sub(⌜φ(x)⌝, ⌜x⌝, ⌜S^b 0⌝) = ⌜φ(S^b 0)⌝,


so

P^Σ(a,b) ⟺ Σ ⊢ φ(S^b 0).


Lemma 5.6.4. Suppose Σ ⊇ N is consistent. Then each computable set X ⊆ ℕ
is of the form X = P^Σ(a) for some a ∈ ℕ.


Proof. Let X ⊆ ℕ be computable. Then X is Σ-representable by Theorem 5.4.4,
say by the formula φ(x), i.e. X(b) ⟹ Σ ⊢ φ(S^b 0), and ¬X(b) ⟹ Σ ⊢ ¬φ(S^b 0).
So X(b) ⟺ Σ ⊢ φ(S^b 0) (using consistency to get “⇐”). Take a = ⌜φ(x)⌝; then
X(b) iff Σ ⊢ φ(S^b 0) iff P^Σ(a,b), that is, X = P^Σ(a).


Proof of Church’s Theorem. Let Σ ⊇ N be consistent. We have to show that
then Th(Σ) is undecidable, that is, ⌜Th(Σ)⌝ is not computable. Suppose that
⌜Th(Σ)⌝ is computable. Then the antidiagonal Q^Σ ⊆ ℕ of P^Σ is computable:


b ∈ Q^Σ ⟺ (b,b) ∉ P^Σ ⟺ Sub(b, ⌜x⌝, Num(b)) ∉ ⌜Th(Σ)⌝.


By Lemma 5.6.3, Q^Σ is not among the P^Σ(a). Therefore by Lemma 5.6.4, Q^Σ
is not computable, a contradiction. This concludes the proof.


By Lemma 5.5.5 the subset ⌜Th_L(N)⌝ of ℕ is computably generated. But this
set is not computable:


Corollary 5.6.5. Th(N) and Th(∅) (in the language L) are undecidable.


Proof. The undecidability of Th(N) is a special case of Church’s Theorem. Let
∧N be the sentence N1 ∧ ··· ∧ N9. Then, for any L-sentence σ,


N ⊢_L σ ⟺ ∅ ⊢_L ∧N → σ,

that is, for all a ∈ ℕ,

a ∈ ⌜Th(N)⌝ ⟺ a ∈ Sent and ⟨SN(∨), ⟨SN(¬), ⌜∧N⌝⟩, a⟩ ∈ ⌜Th(∅)⌝.


Therefore, if Th(∅) were decidable, then Th(N) would be decidable; but Th(N)
is undecidable. So Th(∅) is undecidable.

We assumed in the beginning of this section that L ⊇ L(N), but the statement
that Th_L(∅) is undecidable also makes sense without this restriction. For that
statement to be true, however, we cannot just drop this restriction. For example,
if L = {F} with F a unary function symbol, then Th_L(∅) is decidable.

Readers unhappy with the restriction that L is finite can replace it by the
weaker one that L is countable and equipped with a computable numbering as
defined at the end of Section 5.5. Such a numbering comes with corresponding
notions of “computable” (for a set of L-sentences) and “decidable” (for an
L-theory), and the above material in this section goes through with the same
proofs.


Discussion. We have seen that N is quite weak. A very strong set of axioms
in the language L(N) is PA (1st order Peano Arithmetic). Its axioms are those
of N together with all induction axioms, that is, all sentences of the form


∀x [(φ(x,0) ∧ ∀y (φ(x,y) → φ(x,Sy))) → ∀y φ(x,y)]


where φ(x,y) is an L(N)-formula, x = (x_1,...,x_n), and ∀x stands for ∀x_1 ... ∀x_n.

Note that PA is consistent, since it has 𝔑 = (ℕ; <,0,S,+,·) as a model.
Also ⌜PA⌝ is computable (exercise). Thus by the theorems above, Th(PA) is
undecidable and incomplete. To appreciate the significance of this result, one
needs a little background knowledge, including some history.

Over a century of experience has shown that number theoretic assertions can
be expressed by sentences of L(N), admittedly in an often contorted way. (That
is, we know how to construct for any number theoretic statement a sentence σ
of L(N) such that the statement is true if and only if 𝔑 ⊨ σ. In most cases we
just indicate how to construct such a sentence, since an actual sentence would
be too unwieldy without abbreviations.)

What is more important, we know from experience that any established fact
of classical number theory, including results obtained by sophisticated analytic
and algebraic methods, can be proved from PA, in the sense that PA ⊢ σ for the
sentence σ expressing that fact. Thus before Gödel’s Incompleteness Theorem
it seemed natural to conjecture that PA is complete. (Did people realize at
the time that completeness of PA, or similar statements, imply the decidability
of number theory? This is not clear to me, but decidability of number theory
would surely have been considered as astonishing. Part of the issue here is that
notions of completeness and decidability were at the time, before Gödel, just
in the process of being defined.) Of course, the situation cannot be remedied
by adding new axioms to PA, at least if we insist that the axioms are true in
𝔑 and that we have effective means to tell which sentences are axioms. In this
sense, the Incompleteness Theorem is pervasive.


5.7 A more explicit incompleteness theorem 


In Section 5.6 we obtained Gödel’s Incompleteness Theorem as an immediate
corollary of Church’s theorem. In this section, we prove the incompleteness
theorem in the more explicit form stated in the introduction to this chapter.


In this section L ⊇ L(N) is a finite language, and Σ is a set of L-sentences.
We also fix two distinct variables x and y.


We shall indicate how to construct, for any computable consistent Σ ⊇ N, a
formula φ(x) of L(N) with the following properties:


(i) N ⊢ φ(S^m 0) for each m;
(ii) Σ ⊬ ∀xφ(x).


Note that then the sentence ∀xφ(x) is true in 𝔑 but not provable from Σ. Here
is a sketch of how to make such a sentence. Assume for simplicity that L = L(N)
and 𝔑 ⊨ Σ. The idea is to construct sentences σ and σ' such that

(1) 𝔑 ⊨ σ ↔ σ'; and (2) 𝔑 ⊨ σ' ⟺ Σ ⊬ σ.

From (1) and (2) we get 𝔑 ⊨ σ ⟺ Σ ⊬ σ. Assuming that 𝔑 ⊨ ¬σ produces
a contradiction: then Σ ⊢ σ, so 𝔑 ⊨ σ because 𝔑 ⊨ Σ. Hence σ is true in 𝔑,
and thus(!) not provable from Σ.

How to implement this strange idea? To take care of (2), one might guess
that σ' = ∀x¬pr(x, S^{⌜σ⌝}0) where pr(x,y) is a formula representing in N the
binary relation Pr ⊆ ℕ^2 defined by


Pr(m,n) ⟺ m is the Gödel number of a proof from Σ
          of a sentence with Gödel number n.


But how do we arrange (1)? Since σ' := ∀x¬pr(x, S^{⌜σ⌝}0) depends on σ, the
solution is to apply the fixed-point lemma below to ρ(y) := ∀x¬pr(x, y).
This finishes our sketch. What follows is a rigorous implementation.


Lemma 5.7.1 (Fixpoint Lemma). Suppose Σ ⊇ N. Then for every L-formula
ρ(y) there is an L-sentence σ such that Σ ⊢ σ ↔ ρ(S^n 0) where n = ⌜σ⌝.


Proof. The function (a,b) ↦ Sub(a, ⌜x⌝, Num(b)) : ℕ^2 → ℕ is computable by
Lemma 5.5.2. Hence by the representability theorem it is N-representable. Let
sub(x_1, x_2, y) be an L(N)-formula representing it in N. We can assume that the
variable x does not occur in sub(x_1, x_2, y). Then for all a,b in ℕ,


N ⊢ sub(S^a 0, S^b 0, y) ↔ y = S^c 0, where c = Sub(a, ⌜x⌝, Num(b))    (1)


Now let ρ(y) be an L-formula. Define θ(x) := ∃y(sub(x, x, y) ∧ ρ(y)) and let
m = ⌜θ(x)⌝. Let σ := θ(S^m 0), and put n = ⌜σ⌝. We claim that


Σ ⊢ σ ↔ ρ(S^n 0).

Indeed,

n = ⌜σ⌝ = ⌜θ(S^m 0)⌝ = Sub(⌜θ(x)⌝, ⌜x⌝, ⌜S^m 0⌝) = Sub(m, ⌜x⌝, Num(m)).


So by (1),

N ⊢ sub(S^m 0, S^m 0, y) ↔ y = S^n 0    (2)

We have

σ = θ(S^m 0) = ∃y(sub(S^m 0, S^m 0, y) ∧ ρ(y)),


so by (2) we get Σ ⊢ σ ↔ ∃y(y = S^n 0 ∧ ρ(y)). Hence, Σ ⊢ σ ↔ ρ(S^n 0).
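
The self-reference in this proof has a familiar analogue for strings. The Python sketch below is only an analogy and not the construction itself: the placeholder X plays the role of the variable x, repr (quotation) plays the role of Num, diag plays the role of b ↦ Sub(b, ⌜x⌝, Num(b)), and rho is an uninterpreted name standing for ρ. All of these names are assumptions made for the illustration.

```python
def diag(s):
    """Substitute a quoted copy of s for the placeholder X in s itself."""
    return s.replace("X", repr(s))


theta = "rho(diag(X))"   # plays the role of θ(x)
sigma = diag(theta)      # plays the role of σ = θ(S^m 0)

# The argument of rho inside sigma is an expression whose value is sigma.
inner = sigma[len("rho("):-1]
```

Evaluating inner yields the string sigma again, so sigma applies rho to a description of itself, just as σ is provably equivalent to ρ applied to the numeral of ⌜σ⌝.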


Theorem 5.7.2. Suppose Σ ⊇ N is consistent and computable. Then there
exists an L(N)-formula φ(x) such that N ⊢ φ(S^m 0) for each m, but Σ ⊬ ∀xφ(x).


Proof. Consider the relation Pr_Σ ⊆ ℕ^2 defined by


Pr_Σ(m,n) ⟺ m is the Gödel number of a proof from Σ
            of an L-sentence with Gödel number n.


Since Σ is computable, Pr_Σ is computable. Hence Pr_Σ is representable in N. Let
pr_Σ(x,y) be an L(N)-formula representing Pr_Σ in N, and hence in Σ. Because
Σ is consistent we have for all m, n:


Σ ⊢ pr_Σ(S^m 0, S^n 0) ⟺ Pr_Σ(m,n)        (1)
Σ ⊢ ¬pr_Σ(S^m 0, S^n 0) ⟺ ¬Pr_Σ(m,n)      (2)


Let ρ(y) be the L(N)-formula ∀x¬pr_Σ(x,y). Lemma 5.7.1 (with L = L(N) and
Σ = N) provides an L(N)-sentence σ such that N ⊢ σ ↔ ρ(S^{⌜σ⌝}0). It follows
that Σ ⊢ σ ↔ ρ(S^{⌜σ⌝}0), that is


Σ ⊢ σ ↔ ∀x¬pr_Σ(x, S^{⌜σ⌝}0)    (3)


Claim: Σ ⊬ σ. Assume towards a contradiction that Σ ⊢ σ; let m be the
Gödel number of a proof of σ from Σ, so Pr_Σ(m, ⌜σ⌝). Because of (3) we also
have Σ ⊢ ∀x¬pr_Σ(x, S^{⌜σ⌝}0), so Σ ⊢ ¬pr_Σ(S^m 0, S^{⌜σ⌝}0), which by (2) yields
¬Pr_Σ(m, ⌜σ⌝), a contradiction. This establishes the claim.

Now put φ(x) := ¬pr_Σ(x, S^{⌜σ⌝}0). We now show:


(i) N ⊢ φ(S^m 0) for each m. Because Σ ⊬ σ, no m is the Gödel number
of a proof of σ from Σ. Hence ¬Pr_Σ(m, ⌜σ⌝) for each m, which by the
defining property of pr_Σ yields N ⊢ ¬pr_Σ(S^m 0, S^{⌜σ⌝}0) for each m, that is,
N ⊢ φ(S^m 0) for each m.


(ii) Σ ⊬ ∀xφ(x). This is because of the Claim and Σ ⊢ σ ↔ ∀xφ(x), by (3).


Corollary 5.7.3. Suppose that Σ is computable and true in an L-expansion 𝔑*
of 𝔑. Then there exists an L(N)-formula φ(x) such that N ⊢ φ(S^n 0) for each
n, but Σ ∪ N ⊬ ∀xφ(x).

(Note that then ∀xφ(x) is true in 𝔑* but not provable from Σ.) To obtain this
corollary, apply the theorem above to Σ ∪ N in place of Σ.


This entire section, including the exercises below, goes through if we replace the
standing assumption that L is finite by the weaker one that L is countable and
equipped with a computable numbering as defined at the end of Section 5.5.

Exercises. The results below are due to Tarski and known as the undefinability of
truth. The first exercise strengthens the special case of Church’s theorem which says
that the set of Gödel numbers of L-sentences true in a given L-expansion 𝔑* of 𝔑 is
not computable. Both (1) and (2) are easy applications of the fixpoint lemma.


(1) Let 𝔑* be an L-expansion of 𝔑. Then the set of Gödel numbers of L-sentences
true in 𝔑* is not definable in 𝔑*.


(2) Suppose Σ ⊇ N is consistent. Then the set ⌜Th(Σ)⌝ is not Σ-representable, and
there is no truth definition for Σ. Here a truth definition for Σ is an L-formula
true(y) such that for all L-sentences σ,


Σ ⊢ σ ↔ true(S^n 0), where n = ⌜σ⌝.


5.8 Undecidable Theories 


Church’s theorem says that any consistent theory containing a certain basic
amount of integer arithmetic is undecidable. How about theories like Th(Fl)
(the theory of fields), and Th(Gr) (the theory of groups)? An easy way to prove
the undecidability of such theories is due to Tarski: he noticed that if 𝔑 is
definable in some model of a theory T, then T is undecidable. The aim of this
section is to establish this result and indicate some applications. In order not to 
distract from this theme by tedious details, we shall occasionally replace a proof 
by an appeal to the Church-Turing Thesis. (A conscientious reader will replace 
these appeals by proofs until reaching a level of skill that makes constructing 
such proofs utterly routine.) 

In this section, L and L' are finite languages, Σ is a set of L-sentences, and
Σ' is a set of L'-sentences.


Lemma 5.8.1. Let L ⊆ L' and Σ ⊆ Σ'.

(1) Suppose Σ' is conservative over Σ. Then

Th_L(Σ) is undecidable ⟹ Th_{L'}(Σ') is undecidable.

(2) Suppose L = L' and Σ' ∖ Σ is finite. Then

Th(Σ') is undecidable ⟹ Th(Σ) is undecidable.

(3) Suppose all symbols of L' ∖ L are constant symbols. Then

Th_L(Σ) is undecidable ⟺ Th_{L'}(Σ) is undecidable.

(4) Suppose Σ' extends Σ by a definition. Then

Th_L(Σ) is undecidable ⟺ Th_{L'}(Σ') is undecidable.
Proof. (1) In this case we have for all a € N, 


ae PTh, (5) <=> ac Sent, andae Thy (d’)7. 
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It follows that if Thz/(X’) is decidable, so is Thy (). 
(2) Write ©’ = {o1,...,7an} U5, and put o/ := 0, A--- Aon. Then for each 
[-sentence o we have ©’ 6 => St o’ 40, 80 for alla ec N, 


aé€'Th(d’)’ = > aé Sent and (SN(V), (SN(-),0/1),a) €"Th(S) 1. 


It follows that if Th() is decidable then so is Th(X’). 

(3) Let co,...,C» be the distinct constant symbols of L’ \ L. Given any L’- 
sentence o we define the L-sentence o’ as follows: take k € N minimal such 
that o contains no variable v,,, with m > k, replace each occurrence of ¢; in o 
by ve+i for i = 0,...,7, and let y(vg,.--,Vk4n) be the resulting LZ-formula (so 
a = 9(Co,.--;Cn)); then o! := Ve... Wein Y(Ve,---;Ve-tn)- An easy argument 
using the completeness theorem shows that 


Sep o => Sez oa’. 


By the Church-Turing Thesis there is a computable function a> a’: N > N 
such that "o’'!="o™ for all L’-sentences o; we leave it to the reader to replace 
this appeal to the Church-Turing Thesis by a proof. Then, for all a € N: 


aél Thy (x)7 <=> ac Sentry, and ae “Thy (xy. 


This yields the < direction of (3); the converse holds by (1). 
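(4) The $\Longrightarrow$ direction holds by (1). For the $\Longleftarrow$ direction we use an algorithm (see
Section 4.5) that computes for each $L'$-sentence $\sigma$ an $L$-sentence $\sigma^*$ such that
$\Sigma' \vdash \sigma \leftrightarrow \sigma^*$. By the Church-Turing Thesis there is a computable function
$a \mapsto a^* : \mathbf{N} \to \mathbf{N}$ such that $\ulcorner \sigma^* \urcorner = \ulcorner \sigma \urcorner^*$ for all $L'$-sentences $\sigma$. Hence, for all
$a \in \mathbf{N}$,

$a \in \ulcorner \mathrm{Th}_{L'}(\Sigma') \urcorner \iff a \in \mathrm{Sent}_{L'}$ and $a^* \in \ulcorner \mathrm{Th}_L(\Sigma) \urcorner$.

This yields the $\Longleftarrow$ direction of (4).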
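
The translation $\sigma \mapsto \sigma'$ used in part (3) can be sketched concretely. The following is only an illustration under assumptions of our own: formulas are encoded as nested Python tuples (tags `"var"`, `"const"`, `"fn"`, `"rel"`, `"not"`, `"or"`, `"and"`, `"forall"`, `"exists"`), and the function names are hypothetical. It works on formulas rather than on Gödel numbers, so it does not by itself replace the appeal to the Church-Turing Thesis.

```python
def variable_indices(expr):
    """Indices m of all variables v_m occurring in expr, free or bound."""
    tag = expr[0]
    if tag == "var":
        return {expr[1]}
    if tag == "const":
        return set()
    if tag in ("forall", "exists"):
        return {expr[1]} | variable_indices(expr[2])
    # ("fn", name, args...) and ("rel", name, args...) carry a name in slot 1;
    # the connectives "not", "or", "and" do not.
    parts = expr[2:] if tag in ("fn", "rel") else expr[1:]
    found = set()
    for part in parts:
        found |= variable_indices(part)
    return found


def replace_constants(expr, mapping):
    """Replace each constant c listed in mapping by the variable v_{mapping[c]}."""
    tag = expr[0]
    if tag == "const":
        return ("var", mapping[expr[1]]) if expr[1] in mapping else expr
    if tag == "var":
        return expr
    if tag in ("forall", "exists"):
        return (tag, expr[1], replace_constants(expr[2], mapping))
    if tag in ("fn", "rel"):
        return expr[:2] + tuple(replace_constants(p, mapping) for p in expr[2:])
    return (tag,) + tuple(replace_constants(p, mapping) for p in expr[1:])


def eliminate_constants(sentence, new_constants):
    """Build sigma' from sigma: c_i becomes v_{k+i}, then quantify universally."""
    used = variable_indices(sentence)
    # k is minimal such that no variable v_m with m >= k occurs in the sentence.
    k = max(used) + 1 if used else 0
    mapping = {c: k + i for i, c in enumerate(new_constants)}
    body = replace_constants(sentence, mapping)
    # Prefix the quantifiers so that the result reads: forall v_k ... forall v_{k+n}.
    for i in reversed(range(len(new_constants))):
        body = ("forall", k + i, body)
    return body
```

For instance, with the new constants `["c0", "c1"]`, the sentence $\forall v_0\,(R(v_0, c_0) \vee \neg R(c_1, v_0))$ has $k = 1$ and is sent to $\forall v_1 \forall v_2 \forall v_0\,(R(v_0, v_1) \vee \neg R(v_2, v_0))$. Constants not listed in `new_constants` are left untouched, as they belong to $L$.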


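Remark. We cannot drop the assumption $L = L'$ in (2): take $L = \emptyset$, $\Sigma = \emptyset$,
$L' = L(\underline{\mathrm{N}})$ and $\Sigma' = \emptyset$. Then $\mathrm{Th}_{L'}(\Sigma')$ is undecidable by Corollary 5.6.5, but
$\mathrm{Th}_L(\Sigma)$ is decidable (exercise).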


Definition. An L-structure A is said to be strongly undecidable if for every set
$\Sigma$ of L-sentences such that $A \models \Sigma$, $\mathrm{Th}(\Sigma)$ is undecidable.


So A is strongly undecidable iff every L-theory of which A is a model is unde- 
cidable. 


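Example. $\mathfrak{N} = (\mathbf{N}; <, 0, S, +, \cdot)$ is strongly undecidable. To see this, let $\Sigma$
be a set of $L(\underline{\mathrm{N}})$-sentences such that $\mathfrak{N} \models \Sigma$. We have to show that $\mathrm{Th}(\Sigma)$ is
undecidable. Now $\mathfrak{N} \models \Sigma \cup \underline{\mathrm{N}}$. By Church's Theorem $\mathrm{Th}(\Sigma \cup \underline{\mathrm{N}})$ is undecidable,
hence $\mathrm{Th}(\Sigma)$ is undecidable by part (2) of Lemma 5.8.1.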
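
Part (2) of Lemma 5.8.1, as used in this example, rests on the equivalence $\Sigma' \vdash \sigma \iff \Sigma \vdash \sigma' \to \sigma$, which turns a decision procedure for $\Sigma$ into one for $\Sigma'$. The sketch below illustrates only the shape of this reduction, in a toy setting that is our own assumption: propositional formulas as nested tuples, with truth-table entailment standing in for provability (legitimate for propositional logic by completeness). All names are hypothetical.

```python
from itertools import product


def atoms(formula):
    """Propositional atoms (plain strings) occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(part) for part in formula[1:]))


def evaluate(formula, valuation):
    """Truth value of a formula built from atoms with "not", "and", "or"."""
    if isinstance(formula, str):
        return valuation[formula]
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], valuation)
    if op == "and":
        return evaluate(formula[1], valuation) and evaluate(formula[2], valuation)
    if op == "or":
        return evaluate(formula[1], valuation) or evaluate(formula[2], valuation)
    raise ValueError(f"unknown connective: {op}")


def entails(premises, conclusion):
    """Truth-table test: every valuation satisfying the premises satisfies the conclusion."""
    names = sorted(set().union(atoms(conclusion), *(atoms(p) for p in premises)))
    for values in product([False, True], repeat=len(names)):
        valuation = dict(zip(names, values))
        if all(evaluate(p, valuation) for p in premises) and not evaluate(conclusion, valuation):
            return False
    return True


def conjunction(formulas):
    """sigma' := sigma_1 and ... and sigma_n, for a nonempty list of formulas."""
    result = formulas[0]
    for formula in formulas[1:]:
        result = ("and", result, formula)
    return result


def decide_with_extra_axioms(decide_sigma, extra_axioms, sigma):
    """Decide Sigma' |- sigma by asking the Sigma-oracle about (not sigma') or sigma."""
    sigma_prime = conjunction(extra_axioms)
    return decide_sigma(("or", ("not", sigma_prime), sigma))
```

With `base = [("or", ("not", "p"), "q")]` as $\Sigma$ and the single extra axiom `"p"`, calling `decide_with_extra_axioms(lambda f: entails(base, f), ["p"], "q")` agrees with the direct test `entails(base + ["p"], "q")`. The reduction never inspects $\Sigma$ itself; it only queries the given decision procedure, exactly as in the proof.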


The following result is an easy application of part (3) of the previous lemma. 
Lemma 5.8.2. Let $c_0, \dots, c_n$ be distinct constant symbols not in $L$, and let
$(A, a_0, \dots, a_n)$ be an $L(c_0, \dots, c_n)$-expansion of the $L$-structure $A$. Then

$(A, a_0, \dots, a_n)$ is strongly undecidable $\Longrightarrow$ $A$ is strongly undecidable.


Theorem 5.8.3 (Tarski). Suppose the L-structure A is definable in the L*- 
structure B and A is strongly undecidable. Then B is strongly undecidable. 


Proof. (Sketch) By the previous lemma (with $L^*$ and $B$ instead of $L$ and $A$),
we can reduce to the case that we have a 0-definition $\delta : A \to B^k$ of $A$ in $B$.
As described at the end of Section 4.5 this allows us to introduce the finite
languages $L_k$ and $L_k^* = L_k \cup L^*$, the $L_k^*$-expansion $B_k$ of $B$, a finite set $\mathrm{Def}(\delta)$
of $L_k^*$-sentences, and a finite set $\Delta(L,k)$ of $L_k$-sentences. Moreover,

$B_k \models \mathrm{Def}(\delta) \cup \Delta(L,k)$.

Section 4.5 contains implicitly an algorithm that computes for any $L$-sentence
$\sigma$ an $L_k$-sentence $\sigma_k$ and an $L^*$-sentence $\delta\sigma$ such that

$A \models \sigma \iff B_k \models \sigma_k$, $\quad \mathrm{Def}(\delta) \vdash \sigma_k \leftrightarrow \delta\sigma$.

Let $\Sigma^*$ be a set of $L^*$-sentences such that $B \models \Sigma^*$; we need to show that
$\mathrm{Th}_{L^*}(\Sigma^*)$ is undecidable. Define $\Sigma$ as the set of $L$-sentences $\sigma$ such that

$\Sigma^* \cup \mathrm{Def}(\delta) \cup \Delta(L,k) \vdash \sigma_k$.

By Lemma 4.5.7 we have for all $L$-sentences $\sigma$,

$\Sigma \vdash \sigma \iff \Sigma^* \cup \mathrm{Def}(\delta) \cup \Delta(L,k) \vdash \sigma_k$.

Suppose towards a contradiction that $\mathrm{Th}_{L^*}(\Sigma^*)$ is decidable. Then

$\mathrm{Th}_{L_k^*}(\Sigma^* \cup \mathrm{Def}(\delta) \cup \Delta(L,k))$

is decidable, by part (2) of Lemma 5.8.1, so we have an algorithm for deciding
whether any given $L_k^*$-sentence is provable from $\Sigma^* \cup \mathrm{Def}(\delta) \cup \Delta(L,k)$, and by the
above equivalence this provides an algorithm for deciding whether any given
$L$-sentence is provable from $\Sigma$. But $A \models \Sigma$, so $\mathrm{Th}_L(\Sigma)$ is undecidable, and we
have a contradiction.


Corollary 5.8.4. Th(Ring) is undecidable, in other words, the theory of rings 
is undecidable. 


Proof. It suffices to show that the ring $(\mathbf{Z}; 0, 1, +, -, \cdot)$ of integers is strongly
undecidable. Using Lagrange's theorem that

$\mathbf{N} = \{a^2 + b^2 + c^2 + d^2 : a, b, c, d \in \mathbf{Z}\}$,

we see that the inclusion map $\mathbf{N} \to \mathbf{Z}$ defines $\mathfrak{N}$ in the ring of integers, so by
Tarski's Theorem the ring of integers is strongly undecidable.
For the same reason, the theory of commutative rings, of integral domains, and
more generally, the theory of any class of rings that has the ring of integers 
among its members is undecidable. 
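
Lagrange's theorem says that, inside the ring of integers, the set $\mathbf{N}$ is defined by the formula $\exists a \exists b \exists c \exists d\, (x = a \cdot a + b \cdot b + c \cdot c + d \cdot d)$. As a sanity check, and not as a proof, this can be tested by brute force on an initial range of integers; the function name below is our own.

```python
from math import isqrt


def is_sum_of_four_squares(n):
    """Brute-force test whether n = a^2 + b^2 + c^2 + d^2 for some integers a, b, c, d."""
    if n < 0:
        return False  # a sum of squares is never negative
    root = isqrt(n)
    # Squares ignore signs and the summands can be reordered, so it suffices
    # to search 0 <= a <= b <= c and then solve for d.
    for a in range(root + 1):
        for b in range(a, root + 1):
            for c in range(b, root + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    break  # larger c only makes rest smaller
                d = isqrt(rest)
                if d * d == rest:
                    return True
    return False
```

The test returns `True` for every `n` in `range(300)` and `False` for negative inputs, in line with the displayed description of $\mathbf{N}$ as a subset of $\mathbf{Z}$.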


Fact. The set $\mathbf{Z} \subseteq \mathbf{Q}$ is 0-definable in the field $(\mathbf{Q}; 0, 1, +, -, \cdot)$ of rational
numbers. Thus the ring of integers is definable in the field of rational numbers.


We shall take this here on faith. The only known proofs use non-trivial results 
about quadratic forms. The first of these proofs is due to Julia Robinson (late 
1940s). The second one is due to Jochen Koenigsmann, Annals of Mathematics 
183 (2016), 73-93, and yields definability of Z by a universal formula. 


Corollary 5.8.5. The theory Th(Fl) of fields is undecidable. The theory of any
class of fields that has the field of rationals among its members is undecidable.


Exercises. The point of exercises (2) and (3) is to prove that the theory of groups 
is undecidable. In fact, the theory of any class of groups that has the group G of (3) 
among its members is undecidable. On the other hand, Th(Ab), the theory of abelian 
groups, is known to be decidable (Szmielew). 

In (2) and (3) we let a, b, c denote integers; also, a divides b (notation: a | b) if
ax = b for some integer x, and c is a least common multiple of a and b if a | c, b | c,
and c | x for every integer x such that a | x and b | x. Recall that if a and b are not
both zero, then they have a unique positive least common multiple, and that if a and
b are coprime (that is, there is no integer x > 1 with x | a and x | b), then they have
ab as a least common multiple.


(1) Argue informally, using the Church-Turing Thesis, that Th(ACF) is decidable. 
You can use the fact that ACF has QE. 


(2) The structure $(\mathbf{Z}; 0, 1, +, |)$ is strongly undecidable, where | is the binary relation
of divisibility on $\mathbf{Z}$. Hint: Show that if $b + a$ is a least common multiple of $a$ and
$a + 1$, and $b - a$ is a least common multiple of $a$ and $a - 1$, then $b = a^2$. Use this to
define the squaring function in $(\mathbf{Z}; 0, 1, +, |)$, and then show that multiplication
is 0-definable in $(\mathbf{Z}; 0, 1, +, |)$.


(3) Consider the group $G$ of bijective maps $\mathbf{Z} \to \mathbf{Z}$, with composition as the group
multiplication. Then $G$ (as a model of Gr) is strongly undecidable. Hint: let $s$
be the element of $G$ given by $s(a) = a + 1$. Check that if $g \in G$ commutes with
$s$, then $g = s^a$ for some $a$. Next show that

$a \mid b \iff s^b$ commutes with each $g \in G$ that commutes with $s^a$.

Use these facts to specify a definition of $(\mathbf{Z}; 0, 1, +, |)$ in the group $G$.


(4) Let $L = \{F\}$ have just a binary function symbol. Then predicate logic in $L$, that
is, $\mathrm{Th}_L(\emptyset)$, is undecidable.


As to exercise (4), predicate logic in the language whose only symbol is a binary 
relation symbol is also known to be undecidable. On the other hand, predicate 
logic in the language whose only symbol is a unary function symbol is decidable. 
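
The arithmetic claim in the hint to exercise (2) can be checked numerically before one tries to prove it. The sketch below is a finite sanity check only, with function names of our own choosing; it uses the notion of least common multiple from the preamble, under which $c$ and $-c$ are both least common multiples, and $0$ is the only one when $a$ or $b$ is zero.

```python
from math import gcd


def is_lcm(c, a, b):
    """Is c a least common multiple of a and b (sign ignored, as in the preamble)?"""
    if a == 0 or b == 0:
        return c == 0  # the only common multiple of 0 and anything is 0
    return abs(c) == abs(a * b) // gcd(a, b)


def hint_holds(bound):
    """Check on |a| <= bound that the two lcm conditions hold exactly for b = a^2."""
    limit = bound * bound + 3 * bound  # covers every candidate b = +-(a^2 +- 2a)
    for a in range(-bound, bound + 1):
        solutions = [
            b
            for b in range(-limit, limit + 1)
            if is_lcm(b + a, a, a + 1) and is_lcm(b - a, a, a - 1)
        ]
        if solutions != [a * a]:
            return False
    return True
```

Here `hint_holds(30)` returns `True`: for each such $a$ the only $b$ satisfying both conditions is $a^2$. This does not replace the proof asked for in the exercise.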


To do?

improve titlepage 

improve or delete index 

more exercises (from homework, exams) 
footnotes pointing to alternative terminology, etc. 


brief discussion of classes at end of “Sets and Maps”? 


at the end of results without proof? 


brief discussion on P=NP in connection with propositional logic 


section(s) on boolean algebra, including Stone representation, Lindenbaum- 
Tarski algebras, etc. 


section on equational logic? (boolean algebras, groups, as examples) 


solution to a problem by Erdős via compactness theorem, and other simple
applications of compactness


include “equality theorem”


translation of one language in another (needed in connection with Tarski 
theorem in last section) 


more details on back-and-forth in connection with unnested formulas 


extra elementary model theory (universal classes, model-theoretic criteria
for qe, etc., application to ACF, maybe extra section on RCF, Ax’s
theorem).


On computability: a few extra results on c.e. sets, and exponential
diophantine result.


basic framework for many-sorted logic. 


